Fluent Python: Clear, Concise, and Effective Programming – Luciano Ramalho

Chapter 17. Iterators, Generators, and Classic Coroutines

  • the iterator design pattern is built into Python.

  • Every standard collection in Python is iterable. An iterable is an object that provides an iterator, which Python uses to drive iteration.

What’s New in This Chapter #

A Sequence of Words #

Why Sequences Are Iterable: The iter Function #

  1. the dispatch flow is like so:

    1. need to iterate on x \(\implies\) Python calls the iter(x) built-in
    2. first, try the __iter__ implementation
    3. otherwise, if __getitem__ is present, build an iterator that fetches items by index, starting from index 0
    4. if neither works, raise TypeError
  2. all Python sequences are iterable:

    by definition, they all implement __getitem__ (especially for backward compatibility).

    standard sequences also implement __iter__, and custom sequences should too

  3. this is an extreme form of duck typing:

    an object is considered iterable not only when it implements the special method __iter__, but also when it implements __getitem__

  4. goose typing approach: just check for the existence of the __iter__ method. No registration needed because abc.Iterable implements __subclasshook__

  5. the duck-typed approach to type-checking for an iterable (calling iter(x) and handling TypeError) is more accurate than the goose-typing check, because it also accepts objects that are iterable only via __getitem__
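The extreme duck typing above can be sketched with a minimal class (`Gloves` is an invented name for illustration) that implements only __getitem__: iter() accepts it, while the goose-typing check does not.

```python
from collections.abc import Iterable

class Gloves:  # hypothetical example: implements only __getitem__
    def __getitem__(self, index):
        if index > 2:
            raise IndexError
        return index * 10

g = Gloves()
print(list(g))                  # iter() falls back to __getitem__: [0, 10, 20]
print(isinstance(g, Iterable))  # False: goose typing misses __getitem__-only iterables
```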

Using iter with a Callable #

  • when used with a callable, the second argument to iter() is a sentinel value that signals the end of iteration.

    the sentinel value itself is never yielded; when the callable returns it, the iterator raises StopIteration instead.

  • iterators may get exhausted.

  • the callable given to iter() MUST NOT require arguments. If necessary, convert it to a partial function (with the arguments pre-bound) so that it is effectively a nullary function.
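A minimal sketch of the two-argument form, using functools.partial to pre-bind the arguments of random.randint:

```python
import random
from functools import partial

# partial pre-binds the arguments, making the callable effectively nullary
roll = partial(random.randint, 1, 6)

# iter() calls roll() repeatedly until it returns the sentinel (6);
# the sentinel itself is never yielded
for value in iter(roll, 6):
    print(value)  # only values 1..5 ever appear
```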

Iterables Versus Iterators #

  • python obtains iterators from iterables

  • any obj for which the iter() builtin can get an iterator is an iterable

    • either gets it from __iter__ or indirectly from __getitem__
  • an iterator raises StopIteration when there are no further items. There is no way to check for emptiness other than calling next(), and no way to reset an iterator other than building a new one from the iterable.

  • the __subclasshook__ implementation within Iterator:

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Iterator:
            return _check_methods(C, '__iter__', '__next__')
        return NotImplemented

    _check_methods is provided by the abc module

    it traverses the MRO of the class and checks whether the methods are implemented

    MISCONCEPTION: virtual subclassing does not always require explicit registration. A __subclasshook__ that relies on _check_methods is an example of implicit virtual subclassing

  • easiest way to type-check for an iterator is the goose-typing check: isinstance(x, abc.Iterator)
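The goose-typing check, the exhaustion behaviour, and the lack of a reset can all be seen in a few lines:

```python
from collections.abc import Iterator

it = iter([1, 2])
print(isinstance(it, Iterator))  # True: goose typing sees __iter__ and __next__
print(next(it))                  # 1
print(next(it))                  # 2
try:
    next(it)
except StopIteration:
    print('exhausted')           # the only way to detect the end
# no reset available: build a fresh iterator from the iterable instead
```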

Sentence Classes with iter #

  • iterators are supposed to implement both __next__ and __iter__. The __iter__ dunder method is there so that iterators work well in places that expect an iterable.

Sentence Take #2: A Classic Iterator #

  • this is just a didactic example: a custom class keeps a cursor for the index of the next value to present, and raises StopIteration when the cursor goes out of bounds.

Don’t Make the Iterable an Iterator for Itself #

  • iterators are also iterable (because they have the __iter__ method that returns self) but iterables are NOT iterators (they can create iterators)

  • common source of error is to confuse the two.

    common antipattern:

    to implement __next__ for an iterable so that an iterable is also an iterator over itself.

  • so a proper implementation of the pattern requires each call to iter(my_iterable) to create a new, independent, iterator.
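A minimal sketch of why each iter() call must return a fresh iterator: nested loops over the same iterable only work because every for loop gets its own independent iterator (the `Letters` class is invented for illustration; if it returned self as its own iterator, the inner loop would exhaust the outer one).

```python
class Letters:
    """Each iter() call returns a new, independent generator."""
    def __init__(self, text):
        self.text = text

    def __iter__(self):
        for ch in self.text:
            yield ch

letters = Letters('ab')
pairs = [(a, b) for a in letters for b in letters]
print(pairs)  # [('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]
```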

Sentence Take #3: A Generator Function #

"""
Sentence: iterate over words using a generator function
"""

# tag::SENTENCE_GEN[]
import re
import reprlib

RE_WORD = re.compile(r'\w+')


class Sentence:

    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        for word in self.words:  # <1>
            yield word  # <2>
        # <3>

# done! <4>

# end::SENTENCE_GEN[]
  • pythonic way is to use a generator instead of a custom class that acts as the iterator

  • here, __iter__ is a generator function

  • a generator function's body doesn't raise StopIteration explicitly; it simply returns when exhausted, and the generator object raises StopIteration on the caller's behalf

How a Generator Works #

  • a generator function is a generator factory

    it is a function, when called, returns a generator object

    generator function generates generator objects

    generator function and generator objects are not the same

  • it is not necessary to have just a single yield (typically inside a loop construct); we can have as many yields as we like in a generator function

  • each next() call on the generator object resumes the body and runs it up to the next yield statement

  • when control falls off the end of the generator function's body, the generator object raises StopIteration

    the consumer of the generator object may handle things cleanly

    When the generator function runs to the end, the generator object raises StopIteration. The for loop machinery catches that exception, and the loop terminates cleanly.

  • Language:

    • functions “return” values, generators “yield” values

      generator functions return generator objects
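The suspend/resume behaviour described above can be observed by driving a generator object manually with next():

```python
def gen_123():
    yield 1
    yield 2
    yield 3

g = gen_123()      # calling the generator function builds a generator object
print(next(g))     # 1: runs the body up to the first yield, then suspends
print(next(g))     # 2
print(next(g))     # 3
try:
    next(g)        # the body falls through the end
except StopIteration:
    print('done')  # the for loop machinery catches this for us
```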

Lazy Sentences #

Sentence Take #4: Lazy Generator #

  • the findall method of the compiled regex is eager, so we switch to the lazy alternative: re.finditer. It returns a generator yielding re.Match instances on demand \(\implies\) it IS lazy.

    finditer builds an iterator over the matches of RE_WORD on self.text, yielding MatchObject instances.

    code:

      """
      Sentence: iterate over words using a generator function
      """
    
      # tag::SENTENCE_GEN2[]
      import re
      import reprlib
    
      RE_WORD = re.compile(r'\w+')
    
    
      class Sentence:
    
          def __init__(self, text):
              self.text = text  # <1>
    
          def __repr__(self):
              return f'Sentence({reprlib.repr(self.text)})'
    
          def __iter__(self):
              for match in RE_WORD.finditer(self.text):  # <2>
                  yield match.group()  # <3>
    
      # end::SENTENCE_GEN2[]
    

Sentence Take #5: Lazy Generator Expression #

  • the intent here is to replace the generator function with a generator expression, which should be seen as syntactic sugar.

  • a generator expression builds a generator object without consuming it, thereby preserving the lazy behaviour

  • code:

      """
      Sentence: iterate over words using a generator expression
      """
    
      # tag::SENTENCE_GENEXP[]
      import re
      import reprlib
    
      RE_WORD = re.compile(r'\w+')
    
    
      class Sentence:
    
          def __init__(self, text):
              self.text = text
    
          def __repr__(self):
              return f'Sentence({reprlib.repr(self.text)})'
    
          def __iter__(self):
              return (match.group() for match in RE_WORD.finditer(self.text))
      # end::SENTENCE_GENEXP[]
    
    
      def main():
          import sys
          import warnings
          try:
              filename = sys.argv[1]
              word_number = int(sys.argv[2])
          except (IndexError, ValueError):
              print(f'Usage: {sys.argv[0]} <file-name> <word-number>')
              sys.exit(2)  # command line usage error
          with open(filename, 'rt', encoding='utf-8') as text_file:
              s = Sentence(text_file.read())
          for n, word in enumerate(s, 1):
              if n == word_number:
                  print(word)
                  break
          else:
              warnings.warn(f'last word is #{n}, {word!r}')
    
      if __name__ == '__main__':
          main()
    
    • the __iter__ method here is no longer a generator function (it has no yield); it uses a generator expression to build a generator object and returns it

      same outcome though, both cases return a generator object

When to Use Generator Expressions #

  • should be seen as a syntactic shortcut to create a generator without defining and calling a function.

  • syntax stuff:

    • if we're passing a genexpr as the only argument to a function, we can omit the surrounding parentheses. This doesn't work when we're supplying more than one argument.
  • compared with generator functions:

    • generator functions are more flexible: they support complex logic across multiple statements and can even be used as coroutines

    • prefer a generator function when the genexpr starts to look too complex.
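The parentheses rule above in action (sum is just a convenient consumer of the genexpr):

```python
# sole argument: the genexpr's own parentheses may be omitted
print(sum(n * n for n in range(4)))         # 14

# with more than one argument, the genexpr needs its own parentheses
print(sum((n * n for n in range(4)), 100))  # 114
```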

Contrasting Iterators and Generators #

  • iterators:

    • anything implementing __next__ method
    • produce data for client code consumption:
      • consumed via drivers such as for loops
      • consumed via the explicit calling of next(it)
    • practically, most iterators in Python are generators.
  • Generators

    • an iterator built by the Python compiler

    • ways to create a generator:

      1. implement a generator function, with a yield keyword. this is a factory of generator objects

      2. use a generator expression to build a generator object

    • it's the generator objects that provide __next__, which makes them iterators; generator functions themselves don't need to implement __next__

    • we can have async generators

An Arithmetic Progression Generator #

  • TRICK: we can see the range function as a built in that generates a bounded arithmetic progression of integers!

  • TRICK: Python 3 has no explicit type coercion method, but we can work around this:

    
      def __iter__(self):
          result_type = type(self.begin + self.step)
          # NOTE: by keeping the target result type, we can coerce values like so:
          result = result_type(self.begin)
          forever = self.end is None
          index = 0

          while forever or result < self.end:
              yield result
              index += 1
              result = self.begin + self.step * index
    
  • if the whole point of a class is to build a generator by implementing __iter__, we can replace the class with a generator function. A generator function is, after all, a generator factory.

    code:

    
      """
      Arithmetic progression generator function::
    
          >>> ap = aritprog_gen(1, .5, 3)
          >>> list(ap)
          [1.0, 1.5, 2.0, 2.5]
          >>> ap = aritprog_gen(0, 1/3, 1)
          >>> list(ap)
          [0.0, 0.3333333333333333, 0.6666666666666666]
          >>> from fractions import Fraction
          >>> ap = aritprog_gen(0, Fraction(1, 3), 1)
          >>> list(ap)
          [Fraction(0, 1), Fraction(1, 3), Fraction(2, 3)]
          >>> from decimal import Decimal
          >>> ap = aritprog_gen(0, Decimal('.1'), .3)
          >>> list(ap)
          [Decimal('0'), Decimal('0.1'), Decimal('0.2')]
    
      """
    
    
      # tag::ARITPROG_GENFUNC[]
      def aritprog_gen(begin, step, end=None):
          result = type(begin + step)(begin)
          forever = end is None
          index = 0
          while forever or result < end:
              yield result
              index += 1
              result = begin + step * index
      # end::ARITPROG_GENFUNC[]
    

Arithmetic Progression with itertools #

  • ready to use generators in itertools, which we can combine

  • some useful ones:

    1. itertools.count is an infinite generator; it accepts a start and a step

    2. itertools.takewhile function: it returns a generator that consumes another generator and stops when a given predicate evaluates to False

      example: gen = itertools.takewhile(lambda n: n < 3, itertools.count(1, .5))

  • code:

    
      # tag::ARITPROG_ITERTOOLS[]
      import itertools
    
      def aritprog_gen(begin, step, end=None):
          first = type(begin + step)(begin)
          ap_gen = itertools.count(first, step)
          if end is None:
              return ap_gen
          return itertools.takewhile(lambda n: n < end, ap_gen)
      # end::ARITPROG_ITERTOOLS[]
    

    NOTE: aritprog_gen here is not a generator function, because it has no yield in its body, but it still returns a generator, just as a generator function does.
  • when implementing generators, know what is available in the standard library, otherwise there’s a good chance you’ll reinvent the wheel.

Generator Functions in the Standard Library #

This section focuses on general-purpose functions that take arbitrary iterables as arguments and return generators that yield selected, computed, or rearranged items.

Some groups of standard generators:

  1. Filtering generator functions:

    they yield a subset of items produced by the input iterable, without changing the items themselves.

  2. Mapping generator functions:

    they yield items computed from each individual item in the input iterable (or iterables)

    • starmap is cool. it does an unpacking from an iterator that yields tuples e.g. this gives us a running average: list(itertools.starmap(lambda a, b: b / a, enumerate(itertools.accumulate(sample), 1)))
      itertools.starmap(function, iterable) is similar to the built-in map(), but each item of the input iterable is itself an iterable (typically a tuple) that gets unpacked as positional arguments: given [(a1, b1), (a2, b2), ...], starmap calls function(a1, b1), function(a2, b2), and so on. Results are yielded lazily as you iterate.

      simplified prototype:

      ```python
      def starmap(function, iterable):
          for args in iterable:
              yield function(*args)  # unpack each tuple as arguments
      ```

      practical example:

      ```python
      from itertools import starmap

      def multiply(x, y):
          return x * y

      pairs = [(2, 3), (4, 5), (6, 7)]
      print(list(starmap(multiply, pairs)))  # [6, 20, 42]
      ```

      comparison with map: map(pow, [(2, 3), (4, 5)]) fails because pow expects two separate arguments but map passes the whole tuple as one; starmap(pow, [(2, 3), (4, 5)]) works because it unpacks the tuples. The same concept appears in multiprocessing.Pool.starmap() for applying multi-argument functions in parallel.
  3. Merging Generators: yield items from multiple input iterables

    • chain.from_iterable: It’s almost like flattening.
  4. Generator functions that expand each input into multiple output items:

    • pairwise is interesting: for each item in the input, it yields a 2-tuple of that item and the next one, if there is a next item.

      list(itertools.pairwise(range(7))) returns [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

  5. TRICK: combinatoric generators; see the elaboration here:

       The combinatoric generators in itertools are memory-efficient iterators that generate permutations, combinations, and Cartesian products lazily, without materializing the full result. There are four of them:

       | Iterator                                     | Purpose                                              |
       |----------------------------------------------|------------------------------------------------------|
       | `product(*iterables, repeat=1)`              | Cartesian product; `repeat` simulates self-products  |
       | `permutations(iterable, r=None)`             | order-sensitive arrangements of length r             |
       | `combinations(iterable, r)`                  | unordered selections of length r without replacement |
       | `combinations_with_replacement(iterable, r)` | unordered selections allowing repeated elements      |

       examples:

       ```python
       from itertools import (product, permutations, combinations,
                              combinations_with_replacement)

       list(product([1, 2], repeat=2))
       # [(1, 1), (1, 2), (2, 1), (2, 2)]

       list(permutations('ABC', 2))
       # [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]

       list(combinations('ABC', 2))
       # [('A', 'B'), ('A', 'C'), ('B', 'C')]

       list(combinations_with_replacement('AB', 2))
       # [('A', 'A'), ('A', 'B'), ('B', 'B')]
       ```

       all four return lazy iterators yielding tuples, in lexicographic order relative to the input. Mind the combinatorial explosion: product yields \(\prod_i |iterable_i|\) (or \(n^{repeat}\)) tuples, permutations \(n!/(n-r)!\), combinations \(\binom{n}{r}\), and combinations_with_replacement \(\binom{n+r-1}{r}\).
    
  6. Rearranging generator functions: yield all items of the input iterable, rearranged in some way

    They each accept at most one input iterable.

    • itertools.groupby and itertools.tee return multiple generators

      • GOTCHA: itertools.groupby assumes that the input iterable is sorted by the grouping criterion, or at least that the items are clustered by that criterion — even if not completely sorted.

        e.g. use case: you can sort the datetime objects chronologically, then groupby weekday to get a group of Monday data, followed by Tuesday data, etc., and then by Monday (of the next week) again, and so on.

    • itertools.tee similar to unix tee, gives us multiple generators to consume the yielded values independently.

      which has a unique behavior: it yields multiple generators from a single input iterable, each yielding every item from the input. Those generators can be consumed independently,
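The groupby gotcha above in a minimal sketch: sort by the grouping key first, or items with the same key end up scattered across several groups.

```python
import itertools

animals = ['rat', 'duck', 'bat', 'bear', 'eel', 'lion']
# without this sort, equal-length words scattered in the input
# would produce multiple separate groups for the same key
animals.sort(key=len)
for length, group in itertools.groupby(animals, key=len):
    print(length, '->', list(group))
# 3 -> ['rat', 'bat', 'eel']
# 4 -> ['duck', 'bear', 'lion']
```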

  • reversed only works with sequences (or objects implementing __reversed__)

Iterable Reducing Functions #

  • given an iterable, they return a single result \(\implies\) “reducing”/ “folding” / “accumulating” functions.

    Naturally, they have to work with bounded iterables, won’t work with infinite iterables.

  • all and any have the ability to short-circuit!
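The short-circuiting can be observed by counting how often a predicate is actually evaluated (`noisy` is a made-up helper for the demonstration):

```python
def noisy(n):
    print(f'checking {n}')
    return n % 2 == 0

# any() stops pulling from the generator at the first truthy result:
# 5 and 6 are never checked
print(any(noisy(n) for n in [1, 3, 4, 5, 6]))  # True, after checking only 1, 3, 4
```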

Subgenerators with yield from #

  • objective is to let a generator delegate to a subgenerator
  • uses yield from

Reinventing chain #

Here’s the implementation without yield from

def chain(*iterables):
    for it in iterables:
        for i in it:
            yield i

s = 'ABC'
r = range(3)

list(chain(s, r))  # ['A', 'B', 'C', 0, 1, 2]

here’s how we can implement itertools.chain using yield from

def chain(*iterables):
    for it in iterables:
        yield from it

s = 'ABC'
r = range(3)

list(chain(s, r))  # ['A', 'B', 'C', 0, 1, 2]

Traversing a Tree #

Step 2: using a subgenerator for the subtrees #

def tree(cls):
    yield cls.__name__, 0
    yield from sub_tree(cls)              # <1> here's the delegation from tree to sub_tree. here, the tree generator is suspended, and sub_tree takes over yielding values


def sub_tree(cls):
    for sub_cls in cls.__subclasses__():
        yield sub_cls.__name__, 1         # <2>


def display(cls):
    for cls_name, level in tree(cls):     # <3>
        indent = ' ' * 4 * level
        print(f'{indent}{cls_name}')


if __name__ == '__main__':
    display(BaseException)
  • the delegation from generator to sub-generator is interesting

    here, the tree generator is suspended, and sub_tree takes over yielding values

  • we soon observe the following pattern:

    We do a for loop to get the subclasses of level N. Each time around the loop, we yield a subclass of level N, then start another for loop to visit level N+1.

Step 5 #

  • we use the pattern seen before and call the same generator function again as a subgenerator:

      def tree(cls):
          yield cls.__name__, 0
          yield from sub_tree(cls, 1)
    
    
      def sub_tree(cls, level):
          for sub_cls in cls.__subclasses__():
              yield sub_cls.__name__, level
              yield from sub_tree(sub_cls, level+1)
    
    
      def display(cls):
          for cls_name, level in tree(cls):
              indent = ' ' * 4 * level
              print(f'{indent}{cls_name}')
    
    
      if __name__ == '__main__':
          display(BaseException)
    

    This is limited only by Python’s recursion limit. The default limit allows 1,000 pending functions.

    This also has an implicit base case:

    sub_tree has no if, but there is an implicit conditional in the for loop: if cls.__subclasses__() returns an empty list, the body of the loop is not executed, therefore no recursive call happens. The base case is when the cls class has no subclasses. In that case, sub_tree yields nothing. It just returns.

Step 6: merge into a single generator #

def tree(cls, level=0):
    yield cls.__name__, level
    for sub_cls in cls.__subclasses__():
        yield from tree(sub_cls, level+1)


def display(cls):
    for cls_name, level in tree(cls):
        indent = ' ' * 4 * level
        print(f'{indent}{cls_name}')


if __name__ == '__main__':
    display(BaseException)

yield from connects the subgenerator directly to the client code, bypassing the delegating generator. That connection becomes really important when generators are used as coroutines and not only produce but also consume values from the client code.

Generic Iterable Types #

  • Mypy reveals that the Iterator type is really a simplified special case of the Generator type.
  • Iterator[T] is a shortcut for Generator[T, None, None]. Both annotations mean “a generator that yields items of type T, but that does not consume or return values.”
  • Generators can consume and return values \(\implies\) they are classic coroutines
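A minimal sketch of the simple case: a generator function that only yields values, where Iterator[int] suffices as the return annotation.

```python
import itertools
from collections.abc import Iterator

def fib() -> Iterator[int]:
    # only yields ints, never consumes or returns values,
    # so Iterator[int] is enough -- no need for Generator[int, None, None]
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

print(list(itertools.islice(fib(), 6)))  # [0, 1, 1, 2, 3, 5]
```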

Classic Coroutines via Enhanced Generators #

  • “generators that can consume and return values”

  • these are not supported by asyncio

  • the modern, native coroutines are just called “coroutines” now.

  • 2 ways to typehint generators:

    The underlying C implementation is the same; they are just USED differently.

    1. as an iterator: readings: Iterator[float]

      Bound to an iterator / generator object that yields float items

    2. as a coroutine: sim_taxi: Generator[Event, float, int]

      The `sim_taxi` variable can be bound to a coroutine representing a taxi cab in a discrete event simulation. It yields events, receives `float` timestamps, and returns the number of trips made during the simulation

  • The type is named Generator, when in fact it describes the API of a generator object intended to be used as a coroutine, while generators are more often used as simple iterators.

    Generator[YieldType, SendType, ReturnType]

    The Generator type takes the same three parameters as typing.Coroutine: Coroutine[YieldType, SendType, ReturnType]. typing.Coroutine is deprecated in favour of collections.abc.Coroutine, and it annotates only native coroutines, not classic coroutines.

  • Some guidelines to avoid confusion:

    • Generators produce data for iteration
    • Coroutines are consumers of data
    • To keep your brain from exploding, don’t mix the two concepts together
    • Coroutines are not related to iteration
    • Note: There is a use of having `yield` produce a value in a coroutine, but it’s not tied to iteration.

Example: Coroutine to Compute a Running Average #

An older version of the running average, using a closure; make_averager is a higher-order function.

def make_averager():
    count = 0
    total = 0

    def averager(new_value):
        nonlocal count, total
        count += 1
        total += new_value
        return total / count
    return averager

In the coroutine version below, the yield statement suspends the coroutine, yields a result to the client, and, later, gets a value sent by the caller, starting another iteration of the infinite loop.

The coroutine keeps internal state without needing any instance attributes or closures. Because coroutines keep local state between activations, they are an attractive replacement for callbacks in async programming.

"""
A coroutine to compute a running average

# tag::CORO_AVERAGER_TEST[]
    >>> coro_avg = averager()  # <1>
    >>> next(coro_avg)  # <2>
    0.0
    >>> coro_avg.send(10)  # <3>
    10.0
    >>> coro_avg.send(30)
    20.0
    >>> coro_avg.send(5)
    15.0

# end::CORO_AVERAGER_TEST[]
# tag::CORO_AVERAGER_TEST_CONT[]

    >>> coro_avg.send(20)  # <1>
    16.25
    >>> coro_avg.close()  # <2>
    >>> coro_avg.close()  # <3>
    >>> coro_avg.send(5)  # <4>
    Traceback (most recent call last):
      ...
    StopIteration

# end::CORO_AVERAGER_TEST_CONT[]

"""

# tag::CORO_AVERAGER[]
from collections.abc import Generator

def averager() -> Generator[float, float, None]:  # <1> yields float, accepts float, nothing useful returned
    total = 0.0
    count = 0
    average = 0.0
    while True:  # <2> will keep accepting as long as there are values sent to this coroutine
        term = yield average  # <3> the yield suspends the coroutine, yields a result to the client, and later gets a value sent by the caller, starting another iteration of the infinite loop
        total += term
        count += 1
        average = total/count
# end::CORO_AVERAGER[]

Priming/Starting the Coroutine #

We can do an initial next(my_coroutine)

OR, we can send(None) to start it off. Only None works here because a coroutine can’t accept a sent value, unless it is suspended at a yield line.
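Both priming styles in a quick sketch (the averager coroutine from above is restated for self-containment):

```python
def averager():
    # same coroutine as in the listing above
    total, count, average = 0.0, 0, 0.0
    while True:
        term = yield average
        total += term
        count += 1
        average = total / count

coro = averager()
coro.send(None)       # prime with send(None); equivalent to next(coro)
print(coro.send(10))  # 10.0

fresh = averager()
try:
    fresh.send(10)    # sending a non-None value before priming fails
except TypeError as exc:
    print(exc)        # can't send non-None value to a just-started generator
```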

Multiple activations #

  • After each activation, the coroutine is suspended precisely at the yield keyword, waiting for a value to be sent.

  • coro_avg.send(10): yield expression resolves to the value 10, assigning it to the term variable. The rest of the loop updates the total, count, and average variables. The next iteration in the while loop yields the average, and the coroutine is again suspended at the yield keyword.

  • note that the coroutine alternates between two states here: running (during an activation) and suspended (waiting at the yield).

Terminating a coroutine #

  • can just stop referring to it and the coroutine can be garbage collected

  • for explicit termination, we can call coro_avg.close()

  • the .close() method raises GeneratorExit at the suspended yield expression. If not handled in the coroutine function, the exception terminates it. GeneratorExit is caught by the generator object that wraps the coroutine, which is why the close() call does not raise an error in the caller.

  • calling close on a closed coroutine does nothing, but sending to a closed coroutine raises StopIteration
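A small sketch of those termination rules (the coroutine body is a trivial stand-in):

```python
def echo():
    # trivial coroutine: consumes values and prints them
    while True:
        received = yield
        print('got', received)

coro = echo()
next(coro)        # prime it
coro.send('hi')   # prints: got hi
coro.close()      # raises GeneratorExit at the suspended yield; coroutine ends
coro.close()      # closing an already-closed coroutine does nothing

try:
    coro.send('bye')
except StopIteration:
    print('sending to a closed coroutine raises StopIteration')
```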

Returning a Value from a Coroutine #

"""
A coroutine to compute a running average.

Testing ``averager2`` by itself::

# tag::RETURNING_AVERAGER_DEMO_1[]

    >>> coro_avg = averager2()
    >>> next(coro_avg)
    >>> coro_avg.send(10)  # <1>
    >>> coro_avg.send(30)
    >>> coro_avg.send(6.5)
    >>> coro_avg.close()  # <2>

# end::RETURNING_AVERAGER_DEMO_1[]

Catching `StopIteration` to extract the value returned by
the coroutine::

# tag::RETURNING_AVERAGER_DEMO_2[]

    >>> coro_avg = averager2()
    >>> next(coro_avg)
    >>> coro_avg.send(10)
    >>> coro_avg.send(30)
    >>> coro_avg.send(6.5)
    >>> try:
    ...     coro_avg.send(STOP)  # <1>
    ... except StopIteration as exc:
    ...     result = exc.value  # <2>
    ...
    >>> result  # <3>
    Result(count=3, average=15.5)

# end::RETURNING_AVERAGER_DEMO_2[]

Using `yield from`:


# tag::RETURNING_AVERAGER_DEMO_3[]

NOTE: this uses a delegating generator

    >>> def compute():
    ...     res = yield from averager2(True)  # <1>
    ...     print('computed:', res)  # <2>
    ...     return res  # <3>
    ...
    >>> comp = compute()  # <4>
    >>> for v in [None, 10, 20, 30, STOP]:  # <5>
    ...     try:
    ...         comp.send(v)  # <6>
    ...     except StopIteration as exc:  # <7> must catch StopIteration: the coroutine's return value travels in the exception's value attribute
    ...         result = exc.value
    received: 10
    received: 20
    received: 30
    received: <Sentinel>
    computed: Result(count=3, average=20.0)
    >>> result  # <8>
    Result(count=3, average=20.0)

# end::RETURNING_AVERAGER_DEMO_3[]
"""

# tag::RETURNING_AVERAGER_TOP[]
from collections.abc import Generator
from typing import Union, NamedTuple

class Result(NamedTuple):  # <1>
    count: int  # type: ignore  # <2>
    average: float

class Sentinel:  # <3>
    def __repr__(self):
        return '<Sentinel>'

STOP = Sentinel()  # <4>

SendType = Union[float, Sentinel]  # <5> in modern Python, write SendType: TypeAlias = float | Sentinel, or use the | union directly in the generator's SendType type param

# end::RETURNING_AVERAGER_TOP[]
# tag::RETURNING_AVERAGER[]
def averager2(verbose: bool = False) -> Generator[None, SendType, Result]:  # <1> None data yielded, returns Result type, which is a named tuple (subclass of tuple)
    total = 0.0
    count = 0
    average = 0.0
    while True:
        term = yield  # <2> this consumes data (when resuming, assigned to variable named "term")
        if verbose:
            print('received:', term)
        if isinstance(term, Sentinel):  # <3> if received the Sentinel, break from infinite loop.
            break
        total += term  # <4>
        count += 1
        average = total / count
    return Result(count, average)  # <5> reachable only if Sentinel is sent to the coroutine

# end::RETURNING_AVERAGER[]
  • this coroutine primarily consumes data (per the SendType type param), which is why a bare yield (yielding None) makes sense here.

  • Calling .close() in this coroutine makes it stop but does not return a result, because the GeneratorExit exception is raised at the yield line in the coroutine, so the return statement is never reached.

    Instead, we do:

        try:
            coro_avg.send(STOP)
        except StopIteration as exc:
            result = exc.value
    1. The STOP sentinel makes the coroutine break from the loop and return a Result. The generator object that wraps the coroutine then raises StopIteration.

    2. The instance of StopIteration has a value attribute bound to the value of the return statement that terminated the coroutine.

Generic Type Hints for Classic Coroutines #

  • stick to my existing mental model:

    • producers can produce more specific types, i.e. can be narrower \(\implies\) they are covariant \(\implies\) they are output types

      so this corresponds to the YieldType and ReturnType

      float :> int

      Generator[float, Any, float] :> Generator[int, Any, int]

      Corresponds to variance rule of thumb 1:

      If a formal type parameter defines a type for data that comes out of the object, it can be covariant.

    • consumers can accept more generic types i.e. they can accept wider \(\implies\) they are contravariant \(\implies\) they are input types

      so this corresponds to SendType

      float :> int

      Generator[Any, float, Any] <: Generator[Any, int, Any]

      Corresponds to variance rule of thumb 2:

      If a formal type parameter defines a type for data that goes into the object after its initial construction, it can be contravariant.

  • elaboration on why my mental model is correct:

      This matches how variance is defined for the Generator and Coroutine generic types.

      Generator[YieldType, SendType, ReturnType]

      - YieldType: the type of values produced (yielded) by the generator/coroutine.
      - SendType: the type of values the coroutine accepts via .send().
      - ReturnType: the type of the value returned on completion (from the return statement).

      Covariance: if A is a subtype of B, a producer of A can substitute for a producer of B. If the caller expects YieldType to be an Animal, yielding a Dog (a subclass of Animal) is safe. Hence YieldType and ReturnType, being output positions, are covariant.

      Contravariance: consumers vary the other way. If A is a subtype of B, a coroutine that accepts B via .send() can substitute for one that accepts A, because it handles the more general case. Hence SendType, an input position, is contravariant.

      The Python typing docs state this explicitly:

      > "Note that unlike many other generic classes in the standard library, the `SendType` of `Generator` behaves contravariantly, not covariantly or invariantly."

      | Role                           | Type Parameter | Variance      |
      |--------------------------------|----------------|---------------|
      | Values yielded (output)        | YieldType      | Covariant     |
      | Values passed in via .send()   | SendType       | Contravariant |
      | Value returned upon completion | ReturnType     | Covariant     |

      A concrete annotation:

      def coro() -> Generator[int, str, float]:
          val = yield 1   # yields ints (YieldType = int)
          ...             # accepts strings sent via .send() (SendType = str)
          return 3.14     # returns a float on completion (ReturnType = float)

      Summary: producers = covariant = YieldType and ReturnType; consumers = contravariant = SendType.

Chapter Summary #

  • keep in view: native coroutines are covered later in the book; the delegation that yield from performs here is done by await in native coroutine syntax.
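A minimal sketch of that parallel using only stdlib asyncio: await delegates to a native sub-coroutine much as yield from delegates to a subgenerator.

```python
import asyncio

async def inner():
    return 42

async def outer():
    # await delegates to inner() much as yield from
    # delegates to a subgenerator in a classic coroutine
    return await inner()

print(asyncio.run(outer()))  # 42
```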

Further Reading #