Page 1 of 1

Advanced Python Generators

#1 atraub  Icon User is offline

  • Pythoneer
  • member icon

Reputation: 759
  • View blog
  • Posts: 2,010
  • Joined: 23-December 08

Post icon  Posted 28 July 2012 - 07:48 PM

Tutorial: Advanced Python Generators
Hey everybody, I've been in a snippet writing mood lately, and I started writing some generators. scalt's tutorial on generators is a nice introduction to generators, but this tutorial will go into more advanced use cases of generators. The focus of this tutorial will be generators involving enormously huge, and even infinite, sets.

Generators are great because they allow you to iterate over a potentially huge list of numbers without having to store the list in its entirety in memory. This quality can make generators very handy, particularly in resource critical situations. Consider this situation: you want to compute the sum of all values from 1 to 100,000,000. How would you do this? You might try this:
total = 0
for i in range(1,100000001):
    total += i


However, this lacks a certain elegance that we've come to expect in Python. For those of you more familiar with list comprehensions, you might think that using one is the way to go:
total = sum([i for i in range(1,100000001)])


But when I ran that, my computer slowed down to a crawl. Who would have thought allocating space in memory for one hundred million integers would slow things down? A better solution is necessary, enter the generator!
total = sum(i for i in range(1,100000001)) #With python 2.x, you'll want to use xrange()


By using a generator, we evaluate total 1 value at a time (lazily). By doing it this way, we don't have to allocate space for 100,000,000 integers in memory at the same time. Don't misunderstand, this code will take a while to compute (it did on mine at least), but, it's far faster and less resource intensive than the comprehension approach. My generator found the total in about 30 seconds on my machine, the comprehension has taken something around 10 minutes.

As you can see, generators provide an elegant solution to a real problem. However, this is not what I love most about generators. What I love most about generators is their utility in infinite sets. In the snippets section, you'll find quite a few generators for infinite sets that I've submitted so far. One tried and true snippet is the Fibonacci generator:
__author__ = "atraub"
__date__= "6/13/2012"


#optionally supply the first two values of generator to seed it
def fibonacciGenerator(previous=0,current=1):
    """Creates a Fibonacci Generator"""

    #handle the first two values
    if previous > current:
        previous,current = current, previous
        
    yield previous
    yield current


    #handle all subsequent values 
    while True:
        current,previous = previous+current,current
        yield current



This generator will infinitely generate fibonacci values. Here's an example of the Fibonacci generator in action (ran from the Python shell):
>>> x = fibonacciGenerator()
>>> for i in range(20):
	print(next(x))

	
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
1597
2584
4181


In this example, we create a generator and then use it to find the first 20 Fibs. If we want the next 20, we can simply run the loop again and our generator will pick right back up where we left off. Another cool aspect of generators is that they can be paired with list comprehensions, as you can see in the following interactive session:
>>> x = fibonacciGenerator()
>>> y = [next(x) for i in range(20)]
>>> print(y)
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]



This is pretty cool. Sometimes, a generator can't easily be expressed in a simple function. For this, Python allows us to make object oriented Generators. For this, I created a generator to create the values of the Kolakoski sequence.
__author__ = "atraub"
__date__= "6/09/2012"

class Kolakoski:
    """Class based generator for the Kolakoski sequence"""
    def __init__(self):
        """Sets up the generator"""
        self.sequence = []
        self.i = 0
        self.currentNum = 1
        self.nextNum = 2

    def __next__(self):
        """Implements next() in Python 3"""
        return self.__realNext()

    def next(self):
        """Implements next() in Python 2"""
        return self.__realNext()

    def __realNext(self):
        """Underlying function to get the next value in the sequence"""
        if self.i == 0:
            """Base case for first value"""
            retVal = [1]
        elif self.i == 1:
            """Base case for second value"""
            retVal = [2,2]
        else:
            """Default case for all values after the first 2"""
            retVal = self.__getNext()

        self.__update(retVal)
        return retVal
    
    def __getNext(self):
        """Gets the next value after the first two"""
        return [self.currentNum for a in range(self.sequence[self.i])]

    def __update(self,retVal):
        """Handles upkeep of the generator"""
        self.i += 1 #increments the index
        self.sequence.extend(retVal) #Adds newest value to underlying sequence
        self.currentNum,self.nextNum = self.nextNum,self.currentNum #sets the current number
    
    def __str__(self):
        """Creates a string representation of the underlying sequence"""
        return ",".join([str(a) for a in self.sequence])

    def getLast(self):
        """gets the last value generated by the sequence"""
        return self.last


Object oriented generators aren't extremely tough. The one nuance is that Python3 requires a method called __next__ and python2 requires the method be named next. This isn't the time or place to walk through the code itself, but it's not overly complex once you have a firm understanding of how the Kolakoski sequence is generated. Despite the different appearance of this generator, calling on it is still syntactically identical. Unfortunately, the Kolakoski sequence still requires an underlying list to generate values, and as of this iteration I wasn't able to find a way to generate values without it, thus, this generator is not ideal for infinite sets. (On a side note, I did develop a version that could save memory by removing elements from the underlying list that were no longer necessary to generate new values, however, the nature of the sequence dictates that even with this optimization, the list will still grow infinitely huge).

I hope that through this tutorial, you've seen that generators do have a lot of power and utility. Whether you want to generate values from infinite sets or perform calculations on massive sets of values, a generator may be the solution you've been looking for.

This post has been edited by atraub: 29 July 2012 - 08:42 AM


Is This A Good Question/Topic? 2
  • +

Replies To: Advanced Python Generators

#2 LionCode  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 1
  • Joined: 30-July 12

Posted 30 July 2012 - 12:07 AM

Nice stuff :)

Star generator?
try:
	lines_of_stars = int(raw_input('Lines of stars:'))
	stars_p_lines = int(raw_input('Stars per line:'))
	for line in range(0, lines_of_stars):
		for star in range(0, stars_p_lines):
			print "*",
		print "\n"
							
except ValueError:
	print "I only accept integers *angry face*"



Was This Post Helpful? 0
  • +
  • -

#3 atraub  Icon User is offline

  • Pythoneer
  • member icon

Reputation: 759
  • View blog
  • Posts: 2,010
  • Joined: 23-December 08

Posted 30 July 2012 - 09:44 PM

That's a nice snippet of code, but unfortunately it's not a true generator. Generators are objects in Python, so for something like this, it would have to exist inside a function. That being said, it does "generate" the desired output :)
Was This Post Helpful? 0
  • +
  • -

#4 fatihmert  Icon User is offline

  • New D.I.C Head

Reputation: -1
  • View blog
  • Posts: 46
  • Joined: 04-March 12

Posted 15 July 2014 - 11:00 PM

What is different to range & xrange?
Was This Post Helpful? 0
  • +
  • -

#5 andrewsw  Icon User is online

  • Fire giant boob nipple gun!
  • member icon

Reputation: 3317
  • View blog
  • Posts: 11,214
  • Joined: 12-December 12

Posted 16 July 2014 - 11:44 AM

View Postfatihmert, on 16 July 2014 - 06:00 AM, said:

What is different to range & xrange?

Look these up in the documentation, or a book or tutorial somewhere.

If you still don't understand then make an attempt to describe what is confusing you. But not in this thread, your question will have little to do with this specific tutorial.

This post has been edited by andrewsw: 16 July 2014 - 11:46 AM

Was This Post Helpful? 0
  • +
  • -

Page 1 of 1