0 Replies - 8404 Views - Last Post: 21 May 2017 - 10:47 PM

#1 jon.kiparsky   User is online

  • Beginner
  • member icon


Reputation: 11474
  • View blog
  • Posts: 19,538
  • Joined: 19-March 11

Break a list into chunks of size n

Posted 21 May 2017 - 10:47 PM

I came across this in a stackoverflow post, and I thought it was so cool I had to share it here.

The problem is to take a list of arbitrary length and return a list of the chunks of size n. Thus, if we have

>>> l = list(range(20))


we would like to be able to do this:

>>> chunkify(l, 5)
[(0, 1, 2, 3, 4), (5, 6, 7, 8, 9) , (10, 11, 12, 13, 14), (15, 16, 17, 18, 19)]



(It doesn't really matter whether the sublists are lists or tuples, but in this case we'll produce tuples)

The obvious way to do this would be to loop over the list or something. I won't even bother writing that out because ugh. The second obvious way to do that would be with a list comp, something like

>>> n = 5
>>> [l[x:x+n] for x in xrange(0, len(data), n)]
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19]]


This works, and there's nothing very wrong with it, but this here is a little hipper:

>>> list(zip(*[iter(l)]*n))
[(0, 1, 2, 3, 4), (5, 6, 7, 8, 9), (10, 11, 12, 13, 14), (15, 16, 17, 18, 19)]


Okay, this is clear as regex, right?

What we're doing, from the inside out, is this:

iter(l)                  # create an iterator over l   
[iter(l)]                # put it in a single-element list
[iter(l)]*n              # duplicate it n times, making a list of five pointers to the same object
zip(*[iter(l)]*5)        # using the star operator to turn this list into an args list, zip the n references to that object together into as many groups of n as the original l will produce
list(zip(*[iter(l)]*5))  # evaluate this zipped list, forcing repeated evaluation of the single iterator in groups of size n


Yeah, it's totally funky, and it doesn't make a lot of sense when you first see it, but thinking this through will be an educational experience for most python programmers, even relatively advanced ones.

But in the end, we can define chunkify as
def chunkify(l, n)
    return list(zip(*[iter(l)]*n))


(strictly speaking we could return just the zip, omitting the last evaluation step, and this would be more lazy in python 3... for homework, explain why)

Is This A Good Question/Topic? 3
  • +

Page 1 of 1