|
1 |
|
2 :mod:`itertools` --- Functions creating iterators for efficient looping |
|
3 ======================================================================= |
|
4 |
|
5 .. module:: itertools |
|
6 :synopsis: Functions creating iterators for efficient looping. |
|
7 .. moduleauthor:: Raymond Hettinger <python@rcn.com> |
|
8 .. sectionauthor:: Raymond Hettinger <python@rcn.com> |
|
9 |
|
10 |
|
11 .. testsetup:: |
|
12 |
|
13 from itertools import * |
|
14 |
|
15 .. versionadded:: 2.3 |
|
16 |
|
17 This module implements a number of :term:`iterator` building blocks inspired by |
|
18 constructs from the Haskell and SML programming languages. Each has been recast |
|
19 in a form suitable for Python. |
|
20 |
|
21 The module standardizes a core set of fast, memory efficient tools that are |
|
22 useful by themselves or in combination. Standardization helps avoid the |
|
23 readability and reliability problems which arise when many different individuals |
|
24 create their own slightly varying implementations, each with their own quirks |
|
25 and naming conventions. |
|
26 |
|
27 The tools are designed to combine readily with one another. This makes it easy |
|
28 to construct more specialized tools succinctly and efficiently in pure Python. |
|
29 |
|
30 For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a |
|
31 sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and |
|
32 :func:`count` which can be combined to form ``imap(f, count())`` and produce an |
|
33 equivalent result. |
|
34 |
|
35 Likewise, the functional tools are designed to work well with the high-speed |
|
36 functions provided by the :mod:`operator` module. |
|
37 |
|
38 Whether cast in pure python form or compiled code, tools that use iterators are |
|
39 more memory efficient (and often faster) than their list based counterparts. Adopting |
|
40 the principles of just-in-time manufacturing, they create data when and where |
|
41 needed instead of consuming memory with the computer equivalent of "inventory". |
|
42 |
|
43 |
|
44 .. seealso:: |
|
45 |
|
46 The Standard ML Basis Library, `The Standard ML Basis Library |
|
47 <http://www.standardml.org/Basis/>`_. |
|
48 |
|
49 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard |
|
50 Libraries <http://www.haskell.org/definition/>`_. |
|
51 |
|
52 |
|
53 .. _itertools-functions: |
|
54 |
|
55 Itertool functions |
|
56 ------------------ |
|
57 |
|
58 The following module functions all construct and return iterators. Some provide |
|
59 streams of infinite length, so they should only be accessed by functions or |
|
60 loops that truncate the stream. |
|
61 |
|
62 |
|
63 .. function:: chain(*iterables) |
|
64 |
|
65 Make an iterator that returns elements from the first iterable until it is |
|
66 exhausted, then proceeds to the next iterable, until all of the iterables are |
|
67 exhausted. Used for treating consecutive sequences as a single sequence. |
|
68 Equivalent to:: |
|
69 |
|
70 def chain(*iterables): |
|
71 # chain('ABC', 'DEF') --> A B C D E F |
|
72 for it in iterables: |
|
73 for element in it: |
|
74 yield element |
|
75 |
|
76 |
|
77 .. function:: itertools.chain.from_iterable(iterable) |
|
78 |
|
79 Alternate constructor for :func:`chain`. Gets chained inputs from a |
|
80 single iterable argument that is evaluated lazily. Equivalent to:: |
|
81 |
|
82 @classmethod |
|
83 def from_iterable(iterables): |
|
84 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F |
|
85 for it in iterables: |
|
86 for element in it: |
|
87 yield element |
|
88 |
|
89 .. versionadded:: 2.6 |
|
90 |
|
91 |
|
92 .. function:: combinations(iterable, r) |
|
93 |
|
94 Return *r* length subsequences of elements from the input *iterable*. |
|
95 |
|
96 Combinations are emitted in lexicographic sort order. So, if the |
|
97 input *iterable* is sorted, the combination tuples will be produced |
|
98 in sorted order. |
|
99 |
|
100 Elements are treated as unique based on their position, not on their |
|
101 value. So if the input elements are unique, there will be no repeat |
|
102 values in each combination. |
|
103 |
|
104 Equivalent to:: |
|
105 |
|
106 def combinations(iterable, r): |
|
107 # combinations('ABCD', 2) --> AB AC AD BC BD CD |
|
108 # combinations(range(4), 3) --> 012 013 023 123 |
|
109 pool = tuple(iterable) |
|
110 n = len(pool) |
|
111 indices = range(r) |
|
112 yield tuple(pool[i] for i in indices) |
|
113 while 1: |
|
114 for i in reversed(range(r)): |
|
115 if indices[i] != i + n - r: |
|
116 break |
|
117 else: |
|
118 return |
|
119 indices[i] += 1 |
|
120 for j in range(i+1, r): |
|
121 indices[j] = indices[j-1] + 1 |
|
122 yield tuple(pool[i] for i in indices) |
|
123 |
|
124 The code for :func:`combinations` can be also expressed as a subsequence |
|
125 of :func:`permutations` after filtering entries where the elements are not |
|
126 in sorted order (according to their position in the input pool):: |
|
127 |
|
128 def combinations(iterable, r): |
|
129 pool = tuple(iterable) |
|
130 n = len(pool) |
|
131 for indices in permutations(range(n), r): |
|
132 if sorted(indices) == list(indices): |
|
133 yield tuple(pool[i] for i in indices) |
|
134 |
|
135 .. versionadded:: 2.6 |
|
136 |
|
137 .. function:: count([n]) |
|
138 |
|
139 Make an iterator that returns consecutive integers starting with *n*. If not |
|
140 specified *n* defaults to zero. Often used as an argument to :func:`imap` to |
|
141 generate consecutive data points. Also, used with :func:`izip` to add sequence |
|
142 numbers. Equivalent to:: |
|
143 |
|
144 def count(n=0): |
|
145 # count(10) --> 10 11 12 13 14 ... |
|
146 while True: |
|
147 yield n |
|
148 n += 1 |
|
149 |
|
150 |
|
151 .. function:: cycle(iterable) |
|
152 |
|
153 Make an iterator returning elements from the iterable and saving a copy of each. |
|
154 When the iterable is exhausted, return elements from the saved copy. Repeats |
|
155 indefinitely. Equivalent to:: |
|
156 |
|
157 def cycle(iterable): |
|
158 # cycle('ABCD') --> A B C D A B C D A B C D ... |
|
159 saved = [] |
|
160 for element in iterable: |
|
161 yield element |
|
162 saved.append(element) |
|
163 while saved: |
|
164 for element in saved: |
|
165 yield element |
|
166 |
|
167 Note, this member of the toolkit may require significant auxiliary storage |
|
168 (depending on the length of the iterable). |
|
169 |
|
170 |
|
171 .. function:: dropwhile(predicate, iterable) |
|
172 |
|
173 Make an iterator that drops elements from the iterable as long as the predicate |
|
174 is true; afterwards, returns every element. Note, the iterator does not produce |
|
175 *any* output until the predicate first becomes false, so it may have a lengthy |
|
176 start-up time. Equivalent to:: |
|
177 |
|
178 def dropwhile(predicate, iterable): |
|
179 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1 |
|
180 iterable = iter(iterable) |
|
181 for x in iterable: |
|
182 if not predicate(x): |
|
183 yield x |
|
184 break |
|
185 for x in iterable: |
|
186 yield x |
|
187 |
|
188 |
|
189 .. function:: groupby(iterable[, key]) |
|
190 |
|
191 Make an iterator that returns consecutive keys and groups from the *iterable*. |
|
192 The *key* is a function computing a key value for each element. If not |
|
193 specified or is ``None``, *key* defaults to an identity function and returns |
|
194 the element unchanged. Generally, the iterable needs to already be sorted on |
|
195 the same key function. |
|
196 |
|
197 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It |
|
198 generates a break or new group every time the value of the key function changes |
|
199 (which is why it is usually necessary to have sorted the data using the same key |
|
200 function). That behavior differs from SQL's GROUP BY which aggregates common |
|
201 elements regardless of their input order. |
|
202 |
|
203 The returned group is itself an iterator that shares the underlying iterable |
|
204 with :func:`groupby`. Because the source is shared, when the :func:`groupby` |
|
205 object is advanced, the previous group is no longer visible. So, if that data |
|
206 is needed later, it should be stored as a list:: |
|
207 |
|
208 groups = [] |
|
209 uniquekeys = [] |
|
210 data = sorted(data, key=keyfunc) |
|
211 for k, g in groupby(data, keyfunc): |
|
212 groups.append(list(g)) # Store group iterator as a list |
|
213 uniquekeys.append(k) |
|
214 |
|
215 :func:`groupby` is equivalent to:: |
|
216 |
|
217 class groupby(object): |
|
218 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B |
|
219 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D |
|
220 def __init__(self, iterable, key=None): |
|
221 if key is None: |
|
222 key = lambda x: x |
|
223 self.keyfunc = key |
|
224 self.it = iter(iterable) |
|
225 self.tgtkey = self.currkey = self.currvalue = object() |
|
226 def __iter__(self): |
|
227 return self |
|
228 def next(self): |
|
229 while self.currkey == self.tgtkey: |
|
230 self.currvalue = self.it.next() # Exit on StopIteration |
|
231 self.currkey = self.keyfunc(self.currvalue) |
|
232 self.tgtkey = self.currkey |
|
233 return (self.currkey, self._grouper(self.tgtkey)) |
|
234 def _grouper(self, tgtkey): |
|
235 while self.currkey == tgtkey: |
|
236 yield self.currvalue |
|
237 self.currvalue = self.it.next() # Exit on StopIteration |
|
238 self.currkey = self.keyfunc(self.currvalue) |
|
239 |
|
240 .. versionadded:: 2.4 |
|
241 |
|
242 |
|
243 .. function:: ifilter(predicate, iterable) |
|
244 |
|
245 Make an iterator that filters elements from iterable returning only those for |
|
246 which the predicate is ``True``. If *predicate* is ``None``, return the items |
|
247 that are true. Equivalent to:: |
|
248 |
|
249 def ifilter(predicate, iterable): |
|
250 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9 |
|
251 if predicate is None: |
|
252 predicate = bool |
|
253 for x in iterable: |
|
254 if predicate(x): |
|
255 yield x |
|
256 |
|
257 |
|
258 .. function:: ifilterfalse(predicate, iterable) |
|
259 |
|
260 Make an iterator that filters elements from iterable returning only those for |
|
261 which the predicate is ``False``. If *predicate* is ``None``, return the items |
|
262 that are false. Equivalent to:: |
|
263 |
|
264 def ifilterfalse(predicate, iterable): |
|
265 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8 |
|
266 if predicate is None: |
|
267 predicate = bool |
|
268 for x in iterable: |
|
269 if not predicate(x): |
|
270 yield x |
|
271 |
|
272 |
|
273 .. function:: imap(function, *iterables) |
|
274 |
|
275 Make an iterator that computes the function using arguments from each of the |
|
276 iterables. If *function* is set to ``None``, then :func:`imap` returns the |
|
277 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is |
|
278 exhausted instead of filling in ``None`` for shorter iterables. The reason for |
|
279 the difference is that infinite iterator arguments are typically an error for |
|
280 :func:`map` (because the output is fully evaluated) but represent a common and |
|
281 useful way of supplying arguments to :func:`imap`. Equivalent to:: |
|
282 |
|
283 def imap(function, *iterables): |
|
284 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000 |
|
285 iterables = map(iter, iterables) |
|
286 while True: |
|
287 args = [it.next() for it in iterables] |
|
288 if function is None: |
|
289 yield tuple(args) |
|
290 else: |
|
291 yield function(*args) |
|
292 |
|
293 |
|
294 .. function:: islice(iterable, [start,] stop [, step]) |
|
295 |
|
296 Make an iterator that returns selected elements from the iterable. If *start* is |
|
297 non-zero, then elements from the iterable are skipped until start is reached. |
|
298 Afterward, elements are returned consecutively unless *step* is set higher than |
|
299 one which results in items being skipped. If *stop* is ``None``, then iteration |
|
300 continues until the iterator is exhausted, if at all; otherwise, it stops at the |
|
301 specified position. Unlike regular slicing, :func:`islice` does not support |
|
302 negative values for *start*, *stop*, or *step*. Can be used to extract related |
|
303 fields from data where the internal structure has been flattened (for example, a |
|
304 multi-line report may list a name field on every third line). Equivalent to:: |
|
305 |
|
306 def islice(iterable, *args): |
|
307 # islice('ABCDEFG', 2) --> A B |
|
308 # islice('ABCDEFG', 2, 4) --> C D |
|
309 # islice('ABCDEFG', 2, None) --> C D E F G |
|
310 # islice('ABCDEFG', 0, None, 2) --> A C E G |
|
311 s = slice(*args) |
|
312 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1)) |
|
313 nexti = it.next() |
|
314 for i, element in enumerate(iterable): |
|
315 if i == nexti: |
|
316 yield element |
|
317 nexti = it.next() |
|
318 |
|
319 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``, |
|
320 then the step defaults to one. |
|
321 |
|
322 .. versionchanged:: 2.5 |
|
323 accept ``None`` values for default *start* and *step*. |
|
324 |
|
325 |
|
326 .. function:: izip(*iterables) |
|
327 |
|
328 Make an iterator that aggregates elements from each of the iterables. Like |
|
329 :func:`zip` except that it returns an iterator instead of a list. Used for |
|
330 lock-step iteration over several iterables at a time. Equivalent to:: |
|
331 |
|
332 def izip(*iterables): |
|
333 # izip('ABCD', 'xy') --> Ax By |
|
334 iterables = map(iter, iterables) |
|
335 while iterables: |
|
336 result = [it.next() for it in iterables] |
|
337 yield tuple(result) |
|
338 |
|
339 .. versionchanged:: 2.4 |
|
340 When no iterables are specified, returns a zero length iterator instead of |
|
341 raising a :exc:`TypeError` exception. |
|
342 |
|
343 The left-to-right evaluation order of the iterables is guaranteed. This |
|
344 makes possible an idiom for clustering a data series into n-length groups |
|
345 using ``izip(*[iter(s)]*n)``. |
|
346 |
|
347 :func:`izip` should only be used with unequal length inputs when you don't |
|
348 care about trailing, unmatched values from the longer iterables. If those |
|
349 values are important, use :func:`izip_longest` instead. |
|
350 |
|
351 |
|
352 .. function:: izip_longest(*iterables[, fillvalue]) |
|
353 |
|
354 Make an iterator that aggregates elements from each of the iterables. If the |
|
355 iterables are of uneven length, missing values are filled-in with *fillvalue*. |
|
356 Iteration continues until the longest iterable is exhausted. Equivalent to:: |
|
357 |
|
358 def izip_longest(*args, **kwds): |
|
359 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D- |
|
360 fillvalue = kwds.get('fillvalue') |
|
361 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop): |
|
362 yield counter() # yields the fillvalue, or raises IndexError |
|
363 fillers = repeat(fillvalue) |
|
364 iters = [chain(it, sentinel(), fillers) for it in args] |
|
365 try: |
|
366 for tup in izip(*iters): |
|
367 yield tup |
|
368 except IndexError: |
|
369 pass |
|
370 |
|
371 If one of the iterables is potentially infinite, then the |
|
372 :func:`izip_longest` function should be wrapped with something that limits |
|
373 the number of calls (for example :func:`islice` or :func:`takewhile`). If |
|
374 not specified, *fillvalue* defaults to ``None``. |
|
375 |
|
376 .. versionadded:: 2.6 |
|
377 |
|
378 .. function:: permutations(iterable[, r]) |
|
379 |
|
380 Return successive *r* length permutations of elements in the *iterable*. |
|
381 |
|
382 If *r* is not specified or is ``None``, then *r* defaults to the length |
|
383 of the *iterable* and all possible full-length permutations |
|
384 are generated. |
|
385 |
|
386 Permutations are emitted in lexicographic sort order. So, if the |
|
387 input *iterable* is sorted, the permutation tuples will be produced |
|
388 in sorted order. |
|
389 |
|
390 Elements are treated as unique based on their position, not on their |
|
391 value. So if the input elements are unique, there will be no repeat |
|
392 values in each permutation. |
|
393 |
|
394 Equivalent to:: |
|
395 |
|
396 def permutations(iterable, r=None): |
|
397 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC |
|
398 # permutations(range(3)) --> 012 021 102 120 201 210 |
|
399 pool = tuple(iterable) |
|
400 n = len(pool) |
|
401 r = n if r is None else r |
|
402 indices = range(n) |
|
403 cycles = range(n, n-r, -1) |
|
404 yield tuple(pool[i] for i in indices[:r]) |
|
405 while n: |
|
406 for i in reversed(range(r)): |
|
407 cycles[i] -= 1 |
|
408 if cycles[i] == 0: |
|
409 indices[i:] = indices[i+1:] + indices[i:i+1] |
|
410 cycles[i] = n - i |
|
411 else: |
|
412 j = cycles[i] |
|
413 indices[i], indices[-j] = indices[-j], indices[i] |
|
414 yield tuple(pool[i] for i in indices[:r]) |
|
415 break |
|
416 else: |
|
417 return |
|
418 |
|
419 The code for :func:`permutations` can be also expressed as a subsequence of |
|
420 :func:`product`, filtered to exclude entries with repeated elements (those |
|
421 from the same position in the input pool):: |
|
422 |
|
423 def permutations(iterable, r=None): |
|
424 pool = tuple(iterable) |
|
425 n = len(pool) |
|
426 r = n if r is None else r |
|
427 for indices in product(range(n), repeat=r): |
|
428 if len(set(indices)) == r: |
|
429 yield tuple(pool[i] for i in indices) |
|
430 |
|
431 .. versionadded:: 2.6 |
|
432 |
|
433 .. function:: product(*iterables[, repeat]) |
|
434 |
|
435 Cartesian product of input iterables. |
|
436 |
|
437 Equivalent to nested for-loops in a generator expression. For example, |
|
438 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``. |
|
439 |
|
440 The nested loops cycle like an odometer with the rightmost element advancing |
|
441 on every iteration. This pattern creates a lexicographic ordering so that if |
|
442 the input's iterables are sorted, the product tuples are emitted in sorted |
|
443 order. |
|
444 |
|
445 To compute the product of an iterable with itself, specify the number of |
|
446 repetitions with the optional *repeat* keyword argument. For example, |
|
447 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``. |
|
448 |
|
449 This function is equivalent to the following code, except that the |
|
450 actual implementation does not build up intermediate results in memory:: |
|
451 |
|
452 def product(*args, **kwds): |
|
453 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy |
|
454 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111 |
|
455 pools = map(tuple, args) * kwds.get('repeat', 1) |
|
456 result = [[]] |
|
457 for pool in pools: |
|
458 result = [x+[y] for x in result for y in pool] |
|
459 for prod in result: |
|
460 yield tuple(prod) |
|
461 |
|
462 .. versionadded:: 2.6 |
|
463 |
|
464 .. function:: repeat(object[, times]) |
|
465 |
|
466 Make an iterator that returns *object* over and over again. Runs indefinitely |
|
467 unless the *times* argument is specified. Used as argument to :func:`imap` for |
|
468 invariant function parameters. Also used with :func:`izip` to create constant |
|
469 fields in a tuple record. Equivalent to:: |
|
470 |
|
471 def repeat(object, times=None): |
|
472 # repeat(10, 3) --> 10 10 10 |
|
473 if times is None: |
|
474 while True: |
|
475 yield object |
|
476 else: |
|
477 for i in xrange(times): |
|
478 yield object |
|
479 |
|
480 |
|
481 .. function:: starmap(function, iterable) |
|
482 |
|
483 Make an iterator that computes the function using arguments obtained from |
|
484 the iterable. Used instead of :func:`imap` when argument parameters are already |
|
485 grouped in tuples from a single iterable (the data has been "pre-zipped"). The |
|
486 difference between :func:`imap` and :func:`starmap` parallels the distinction |
|
487 between ``function(a,b)`` and ``function(*c)``. Equivalent to:: |
|
488 |
|
489 def starmap(function, iterable): |
|
490 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000 |
|
491 for args in iterable: |
|
492 yield function(*args) |
|
493 |
|
494 .. versionchanged:: 2.6 |
|
495 Previously, :func:`starmap` required the function arguments to be tuples. |
|
496 Now, any iterable is allowed. |
|
497 |
|
498 .. function:: takewhile(predicate, iterable) |
|
499 |
|
500 Make an iterator that returns elements from the iterable as long as the |
|
501 predicate is true. Equivalent to:: |
|
502 |
|
503 def takewhile(predicate, iterable): |
|
504 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4 |
|
505 for x in iterable: |
|
506 if predicate(x): |
|
507 yield x |
|
508 else: |
|
509 break |
|
510 |
|
511 |
|
512 .. function:: tee(iterable[, n=2]) |
|
513 |
|
514 Return *n* independent iterators from a single iterable. The case where ``n==2`` |
|
515 is equivalent to:: |
|
516 |
|
517 def tee(iterable): |
|
518 def gen(next, data={}): |
|
519 for i in count(): |
|
520 if i in data: |
|
521 yield data.pop(i) |
|
522 else: |
|
523 data[i] = next() |
|
524 yield data[i] |
|
525 it = iter(iterable) |
|
526 return gen(it.next), gen(it.next) |
|
527 |
|
528 Note, once :func:`tee` has made a split, the original *iterable* should not be |
|
529 used anywhere else; otherwise, the *iterable* could get advanced without the tee |
|
530 objects being informed. |
|
531 |
|
532 Note, this member of the toolkit may require significant auxiliary storage |
|
533 (depending on how much temporary data needs to be stored). In general, if one |
|
534 iterator is going to use most or all of the data before the other iterator, it |
|
535 is faster to use :func:`list` instead of :func:`tee`. |
|
536 |
|
537 .. versionadded:: 2.4 |
|
538 |
|
539 |
|
540 .. _itertools-example: |
|
541 |
|
542 Examples |
|
543 -------- |
|
544 |
|
545 The following examples show common uses for each tool and demonstrate ways they |
|
546 can be combined. |
|
547 |
|
548 .. doctest:: |
|
549 |
|
550 # Show a dictionary sorted and grouped by value |
|
551 >>> from operator import itemgetter |
|
552 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3) |
|
553 >>> di = sorted(d.iteritems(), key=itemgetter(1)) |
|
554 >>> for k, g in groupby(di, key=itemgetter(1)): |
|
555 ... print k, map(itemgetter(0), g) |
|
556 ... |
|
557 1 ['a', 'c', 'e'] |
|
558 2 ['b', 'd', 'f'] |
|
559 3 ['g'] |
|
560 |
|
561 # Find runs of consecutive numbers using groupby. The key to the solution |
|
562 # is differencing with a range so that consecutive numbers all appear in |
|
563 # same group. |
|
564 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28] |
|
565 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x): |
|
566 ... print map(itemgetter(1), g) |
|
567 ... |
|
568 [1] |
|
569 [4, 5, 6] |
|
570 [10] |
|
571 [15, 16, 17, 18] |
|
572 [22] |
|
573 [25, 26, 27, 28] |
|
574 |
|
575 |
|
576 |
|
577 .. _itertools-recipes: |
|
578 |
|
579 Recipes |
|
580 ------- |
|
581 |
|
582 This section shows recipes for creating an extended toolset using the existing |
|
583 itertools as building blocks. |
|
584 |
|
585 The extended tools offer the same high performance as the underlying toolset. |
|
586 The superior memory performance is kept by processing elements one at a time |
|
587 rather than bringing the whole iterable into memory all at once. Code volume is |
|
588 kept small by linking the tools together in a functional style which helps |
|
589 eliminate temporary variables. High speed is retained by preferring |
|
590 "vectorized" building blocks over the use of for-loops and :term:`generator`\s |
|
591 which incur interpreter overhead. |
|
592 |
|
593 .. testcode:: |
|
594 |
|
595 def take(n, iterable): |
|
596 "Return first n items of the iterable as a list" |
|
597 return list(islice(iterable, n)) |
|
598 |
|
599 def enumerate(iterable, start=0): |
|
600 return izip(count(start), iterable) |
|
601 |
|
602 def tabulate(function, start=0): |
|
603 "Return function(0), function(1), ..." |
|
604 return imap(function, count(start)) |
|
605 |
|
606 def nth(iterable, n): |
|
607 "Returns the nth item or empty list" |
|
608 return list(islice(iterable, n, n+1)) |
|
609 |
|
610 def quantify(iterable, pred=bool): |
|
611 "Count how many times the predicate is true" |
|
612 return sum(imap(pred, iterable)) |
|
613 |
|
614 def padnone(iterable): |
|
615 """Returns the sequence elements and then returns None indefinitely. |
|
616 |
|
617 Useful for emulating the behavior of the built-in map() function. |
|
618 """ |
|
619 return chain(iterable, repeat(None)) |
|
620 |
|
621 def ncycles(iterable, n): |
|
622 "Returns the sequence elements n times" |
|
623 return chain.from_iterable(repeat(iterable, n)) |
|
624 |
|
625 def dotproduct(vec1, vec2): |
|
626 return sum(imap(operator.mul, vec1, vec2)) |
|
627 |
|
628 def flatten(listOfLists): |
|
629 return list(chain.from_iterable(listOfLists)) |
|
630 |
|
631 def repeatfunc(func, times=None, *args): |
|
632 """Repeat calls to func with specified arguments. |
|
633 |
|
634 Example: repeatfunc(random.random) |
|
635 """ |
|
636 if times is None: |
|
637 return starmap(func, repeat(args)) |
|
638 return starmap(func, repeat(args, times)) |
|
639 |
|
640 def pairwise(iterable): |
|
641 "s -> (s0,s1), (s1,s2), (s2, s3), ..." |
|
642 a, b = tee(iterable) |
|
643 for elem in b: |
|
644 break |
|
645 return izip(a, b) |
|
646 |
|
647 def grouper(n, iterable, fillvalue=None): |
|
648 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx" |
|
649 args = [iter(iterable)] * n |
|
650 return izip_longest(fillvalue=fillvalue, *args) |
|
651 |
|
652 def roundrobin(*iterables): |
|
653 "roundrobin('ABC', 'D', 'EF') --> A D E B F C" |
|
654 # Recipe credited to George Sakkis |
|
655 pending = len(iterables) |
|
656 nexts = cycle(iter(it).next for it in iterables) |
|
657 while pending: |
|
658 try: |
|
659 for next in nexts: |
|
660 yield next() |
|
661 except StopIteration: |
|
662 pending -= 1 |
|
663 nexts = cycle(islice(nexts, pending)) |
|
664 |
|
665 def powerset(iterable): |
|
666 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])" |
|
667 # Recipe credited to Eric Raymond |
|
668 pairs = [(2**i, x) for i, x in enumerate(iterable)] |
|
669 for n in xrange(2**len(pairs)): |
|
670 yield set(x for m, x in pairs if m&n) |
|
671 |
|
672 def compress(data, selectors): |
|
673 "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F" |
|
674 return (d for d, s in izip(data, selectors) if s) |
|
675 |
|
676 def combinations_with_replacement(iterable, r): |
|
677 "combinations_with_replacement('ABC', 3) --> AA AB AC BB BC CC" |
|
678 pool = tuple(iterable) |
|
679 n = len(pool) |
|
680 indices = [0] * r |
|
681 yield tuple(pool[i] for i in indices) |
|
682 while 1: |
|
683 for i in reversed(range(r)): |
|
684 if indices[i] != n - 1: |
|
685 break |
|
686 else: |
|
687 return |
|
688 indices[i:] = [indices[i] + 1] * (r - i) |
|
689 yield tuple(pool[i] for i in indices) |