## Recursive Generators


— python, iterators, functional programming — 6 min read

This article provides alternative ways to implement the `count`, `cycle` and `repeat` generator functions that are part of the `itertools` module in the Python Standard Library (PSL). The proposed solutions are based on the principle of *functional recursion* and are similar to design patterns used in functional programming. The main message of the article is, however, not to propose potential improvements to the PSL but to challenge programmers and open them to a new way of thinking about iterators. It also goes to show that in the context of iterators the idea of functional recursion comes naturally with the use of the `yield from` statement. Hence, it's very unfortunate that Python doesn't have better support for recursion.

### Recursive Generators 101

Generator functions are functions that contain `yield` or `yield from` statements inside their body. Unlike regular functions, generator functions return generator objects when they are called. If you feel you'd like to brush up on your knowledge of generators, take a look at my other article called Going Over Iterators and Generators.

*Recursive generator functions* are a subset of regular generator functions. More specifically, a recursive generator function is one that calls itself inside its own body. The main obstacle to achieving recursion with generator functions is the fact that they return a generator instance which waits for the built-in `next` function to be called on it in order to produce a value. This behaviour obviously hinders recursion: calling a generator function returns a generator that needs to be unpacked in order to establish recursion. We'll see that in Python this can be achieved using the `yield from` statement.
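To see the obstacle in action, here's a minimal sketch (the `broken_countdown` name is purely illustrative): a generator function that calls itself without unpacking the result silently discards the recursive generator.

```python
def broken_countdown(n):
    yield n
    if n > 0:
        # This creates a new generator object but never consumes it,
        # so the recursive values are silently lost.
        broken_countdown(n - 1)


print(list(broken_countdown(2)))  # only [2] -- the recursion never runs
```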

The following subsection introduces the `yield from` statement, which is essential for unpacking iterators (generators). Afterwards, we'll use it to establish recursive iteration in recursive generator functions.

##### The `yield from` Statement: The Unpacking Mechanism

The `yield from` statement was introduced to delegate to subgenerators in the context of coroutines (see PEP 380). However, when it's used in the context of simple generators (non-coroutine objects) its usage becomes simple. Therefore, for the purpose of this article consider the `yield from` statement as a way to unpack iterable objects (objects with `__iter__` or `__getitem__` dunder methods) until they raise the `StopIteration` exception, at which point the unpacking stops. Consider the following generator function:

```python
def unpack(iterable):
    yield from iterable
```

As long as we make sure not to pass a coroutine as the function argument, the above generator function behaves the same as the following:

```python
def unpack_(iterable):
    for i in iterable:
        yield i
```

Both `unpack` and `unpack_` are generator functions that accept a single argument and return a generator. To show that the functions produce the same results, the following example unpacks a tuple and saves the result in a list:

```python
>>> c = (1, 2, 3,)
>>> list(unpack(c))
[1, 2, 3]
>>> list(unpack_(c))
[1, 2, 3]
```

Now let's create an iterator and pass it to `yield from`:

```python
>>> i = iter(c)
>>> list(unpack(i))
[1, 2, 3]
```

This shows that the behaviour of the `yield from` statement is the same for containers (`tuple`, `list`, `range`) and iterators. The `yield from` statement provides a simple framework for unpacking iterable objects until the `StopIteration` exception signals to stop unpacking.
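The same mechanism also delegates to other generators, not just containers. A minimal sketch (the names are illustrative):

```python
def inner():
    yield 1
    yield 2


def outer():
    yield 0
    yield from inner()  # unpack the subgenerator in place
    yield 3


print(list(outer()))  # [0, 1, 2, 3]
```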

##### Recursive Generators: The Matriyoshka Example

In the following example we'll look at the recursive nature of the Matriyoshka doll and create a recursive generator function that unpacks the doll.

We'll use the `NamedTuple` type to model the doll and specify the size of the doll in a field called `size`. This field reflects the number of dolls that the original holds inside of it. The smallest doll has the `size` field equal to zero (i.e., it doesn't contain other dolls). Here's the model for a matriyoshka doll:

```python
from typing import NamedTuple


class Matriyoshka(NamedTuple):
    size: int
```

For example, `Matriyoshka(size=2)` is an object that models a doll that has two dolls inside it, namely the `Matriyoshka(size=1)` and `Matriyoshka(size=0)` objects. Note that the relationship between the objects is weak in the sense that the original doll only has a larger size but doesn't explicitly enclose the smaller dolls.

Let's create a recursive generator function which recursively opens each doll until the smallest doll is reached (i.e., the doll that has `size` equal to zero):

```python
from typing import Iterator


def open_matriyoshka(m: Matriyoshka) -> Iterator[Matriyoshka]:
    yield m
    if m.size > 0:
        yield from open_matriyoshka(m._make([m.size - 1]))
```

This function is a recursive generator function because it calls itself inside the function body. When the generator function is called with one argument it returns a generator object. Generator objects are iterable and therefore interact nicely with the `yield from` statement. The `yield from` statement unpacks the generator which is created by calling `open_matriyoshka` with a doll of decremented size as the function argument. The decrementing ensures that the doll size becomes smaller. The recursive iteration continues until the *base case for recursion* is reached (i.e., until the algorithm decrements the doll size sufficiently many times to reach the smallest doll with the `size` field equal to zero). Once the base case is reached the recursion stops.

Let's test our recursive generator function and show that it can be used to unpack `Matriyoshka` objects:

```python
>>> next(g := open_matriyoshka(Matriyoshka(size=2)))
Matriyoshka(size=2)
>>> next(g)
Matriyoshka(size=1)
>>> next(g)
Matriyoshka(size=0)
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
```

The generator `g` is created by calling `open_matriyoshka` with the argument `Matriyoshka(size=2)`, i.e., an object that represents a doll which has two dolls inside it. Each time the `next` built-in function is called on the generator `g`, the code inside the generator function runs until the next `yield` statement and produces the value which follows it. When the code reaches the `yield from` statement it creates a new generator and continues searching for the next `yield` statement inside it. When there are no more `yield` statements the original generator `g` is exhausted and the `StopIteration` exception is raised.
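Since `g` is an ordinary iterator, the whole doll can also be consumed with `list` or a `for` loop instead of manual `next` calls. A self-contained sketch (repeating the definitions from above):

```python
from typing import Iterator, NamedTuple


class Matriyoshka(NamedTuple):
    size: int


def open_matriyoshka(m: Matriyoshka) -> Iterator[Matriyoshka]:
    yield m
    if m.size > 0:
        yield from open_matriyoshka(m._make([m.size - 1]))


# The iterator protocol drives the recursion for us:
dolls = list(open_matriyoshka(Matriyoshka(size=2)))
print(dolls)  # [Matriyoshka(size=2), Matriyoshka(size=1), Matriyoshka(size=0)]
```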

In the following section we'll use the recursive mechanism to rewrite three functions from the `itertools` module.

### Reimplementing `count`, `cycle` and `repeat` Functions

`count`, `cycle` and `repeat` are iterator factories defined in the `itertools` module that are used for creating well-optimized iterators that are both fast and memory efficient. The functions are implemented in the C programming language and live inside the itertoolsmodule.c file. For the sake of completeness, every function from the module has an equivalent Python implementation which complements the documentation of the module. Before reimplementing these functions we'll cover the basic use cases and the way they are implemented according to the Python documentation:

`count` creates an infinite counter: `count(10) --> 10 11 12 13 14 ...`

```python
def count(start=0, step=1):
    n = start
    while True:
        yield n
        n += step
```

`cycle` indefinitely cycles over an iterable object: `cycle("ABC") --> A B C A B C A ...`

```python
def cycle(iterable):
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element
```

`repeat` repeats an object infinitely or a finite number of times: `repeat(10, 3) --> 10 10 10`

```python
def repeat(object, times=None):
    if times is None:
        while True:
            yield object
    else:
        for i in range(times):
            yield object
```

Note that these functions share the same design pattern. They create infinite iterators by dropping into an infinite `while` loop (see the `while` loops in the code examples above). This ensures that the iterators always have a `yield` statement that produces a new value and that the iterator is unable to raise the `StopIteration` exception (i.e., to get exhausted). This particular design, using an infinite `while` loop, is an obvious solution that many programmers coming from imperative programming are comfortable with.

Next we'll put on our "functional-programming" hats and reimplement these functions in the form of previously introduced recursive generator functions.

##### Recursive `count` Function

The `count` function is probably the simplest one to reimplement:

```python
from typing import Iterator, Union

Number = Union[int, float]


def recount(x: Number, step: Number = 1) -> Iterator[Number]:
    yield x
    yield from recount(x + step, step)
```

When this generator function is called it creates a generator object. The generator simply yields the first argument that was passed when the generator function was called. Then it creates a new generator with an incremented value. The new generator is unpacked using the `yield from` statement. The code continues to search through the newly created generator for the next `yield` statement. That way new values are produced. Each time a new value is produced, a new generator with an incremented value is created. Here's an example:

```python
>>> r = recount(10, 2)
>>> next(r)
10
>>> next(r)
12
>>> next(r)
14
>>> next(r)
16
```
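Because generators created by `recount` are infinite, `itertools.islice` is a handy way to take a finite slice of their output. A self-contained sketch (repeating the definition above):

```python
from itertools import islice
from typing import Iterator, Union

Number = Union[int, float]


def recount(x: Number, step: Number = 1) -> Iterator[Number]:
    yield x
    yield from recount(x + step, step)


# Take the first four values of an otherwise infinite counter:
print(list(islice(recount(10, 2), 4)))  # [10, 12, 14, 16]
```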

We'll use the same idea to reimplement the `cycle` and `repeat` functions. First the generator yields a value and then it recursively creates a new generator which gets unpacked.

##### Recursive `cycle` Function

Looking at the `cycle` function you'll notice that there is a quirk. Before `cycle` drops into the infinite loop it saves the elements from the iterable into a `list` object. This ensures that the iterable over which the code cycles doesn't get exhausted. This is not needed for container-type objects (e.g., `list`, `tuple`, `range`) but is important for iterators and generators since they get exhausted once there are no more `yield` statements to produce new values. To create a recursive generator function we'll use a similar trick and make a copy of the iterator. The idea is to exhaust one iterator copy and use the other as input for the recursive generator function:

```python
from itertools import tee
from typing import Iterable, Iterator


def recycle(iterable: Iterable) -> Iterator:
    i, i_copy = tee(iterable)
    yield from i_copy
    yield from recycle(i)
```

If the function didn't create a copy it would only work for container-type objects. This implementation, however, is able to infinitely cycle over any iterable object. We can test this by combining two recursive generator functions:

```python
>>> g = recycle(open_matriyoshka(Matriyoshka(size=2)))
>>> next(g)
Matriyoshka(size=2)
>>> next(g)
Matriyoshka(size=1)
>>> next(g)
Matriyoshka(size=0)
>>> next(g)
Matriyoshka(size=2)
```

On the downside, this recursive generator function comes with a performance overhead because it makes a copy of the iterable each time it enters a new cycle.
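As with `recount`, a finite slice of the infinite cycle can be taken with `itertools.islice`. A self-contained sketch (repeating the definition above):

```python
from itertools import islice, tee
from typing import Iterable, Iterator


def recycle(iterable: Iterable) -> Iterator:
    i, i_copy = tee(iterable)
    yield from i_copy  # exhaust one copy...
    yield from recycle(i)  # ...and recurse on the other

# Seven values cycle through the three-element list more than twice:
print(list(islice(recycle([1, 2, 3]), 7)))  # [1, 2, 3, 1, 2, 3, 1]
```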

##### Recursive `repeat` Function

The examples of `recount` and `recycle` have shown how to create infinite recursive generators. The reimplementation of `repeat` poses a new problem. Since `repeat` can create finite iterators, the recursive generator function requires a base case to stop the recursion. Take a look at the following implementation:

```python
from typing import Iterator, TypeVar

T = TypeVar('T')


def rerepeat(x: T, times: int = 0) -> Iterator[T]:
    yield x
    if (new_times := times - 1) != 0:
        yield from rerepeat(x, new_times)
```

Generators created by calling `rerepeat` yield the first argument used during the function call. Afterwards the `times` argument gets decremented. For generators with positive `times` the decrementing means that the value gets closer to one. When `times` becomes one the base case is reached: the `if` statement doesn't activate and the generator stops unpacking new generators. However, for generators with negative (or zero) `times` the decrementing leads to even smaller numbers which drive the value away from the base case. Consequently, generators with `times` lower than or equal to zero are infinite generators.
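Both behaviours can be checked directly; the infinite case (with the default `times=0`) is sliced with `itertools.islice` so it terminates. A self-contained sketch (repeating the definition above):

```python
from itertools import islice
from typing import Iterator, TypeVar

T = TypeVar('T')


def rerepeat(x: T, times: int = 0) -> Iterator[T]:
    yield x
    if (new_times := times - 1) != 0:
        yield from rerepeat(x, new_times)


# With a positive `times` the iterator is finite...
print(list(rerepeat('a', 3)))          # ['a', 'a', 'a']
# ...while the default `times=0` never reaches the base case:
print(list(islice(rerepeat('a'), 5)))  # ['a', 'a', 'a', 'a', 'a']
```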

Here's a taste of what we can do with the `rerepeat` function:

```python
>>> '-'.join(rerepeat("NANA", 8)) + " BATMAN!"
'NANA-NANA-NANA-NANA-NANA-NANA-NANA-NANA BATMAN!'
```

This works only because the `times` argument is positive and the iterator is finite. Running the same example with an infinite iterator would blow up in our face!

### Discussion

The concept of functional recursion is applicable in the context of iterators and results in so-called *recursive generator functions*. We've shown that functional recursion is a natural approach for creating both finite and infinite iterators using the simplest use case of the `yield from` statement. It provides the *unpacking mechanism* for iterable objects and establishes the flow of recursive iteration. This circumvents the need for loops in generator functions.

We've used the concept of functional recursion to reimplement pure functions from the `itertools` module. These exercises have shown how to set a base case for recursion. While implementing the `recycle` function it was pointed out that container-type objects, unlike iterators, cannot be exhausted. Therefore, careful consideration has to be given to the use case when designing a recursive generator function.

It is well known that recursion in Python is a crippled tool since it builds on top of the C stack. This is very unfortunate because it limits the application of recursive generator functions. Until the problem with recursion gets solved, recursive generators will remain a mental exercise aimed at helping programmers expand their way of thinking in terms of the tools used in functional programming.
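The limit is easy to hit in practice: each recursive `yield from` nests one more generator frame, so consuming roughly `sys.getrecursionlimit()` values from `recount` raises `RecursionError`. A sketch (repeating the `recount` definition from above):

```python
import sys
from itertools import islice
from typing import Iterator, Union

Number = Union[int, float]


def recount(x: Number, step: Number = 1) -> Iterator[Number]:
    yield x
    yield from recount(x + step, step)


# Every value produced adds one level of `yield from` delegation, so the
# recursion depth grows with the number of values consumed:
try:
    for _ in islice(recount(0), sys.getrecursionlimit() + 100):
        pass
except RecursionError:
    print("hit Python's recursion limit")
```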