— python — 4 min read
In Python the for
statement is an important tool used for specifying
iteration. This article (literally) goes over the for
statement and uncovers
the mechanism that makes it tick. Before disassembling the for
statement
we'll first look at the way it is used and from there slowly start diving into
the internals.
for
Statement 101The for
statement acts as an interface for iterating over elements
inside objects. The most obvious way for a C programmer to iterate
over an object in Python is the following:
>>> planets = ("Mercury", "Venus", "Earth", "Mars",)>>> for i in range(len(planets)):... print(planets[i])...MercuryVenusEarthMars
The idea here is to create an index variable i
from an arithmetic progression
using the built-in range
function which goes from zero up to but not
including len(planets)
. The index variable is used with the container object
in following way planets[i]
to retrieve the elements one by one in each
iteration.
The problem with the previous example is that there are several things
happening at the same time making it hard to separate the "business logic" in
the code from the boilerplate required for iterating. In order to iterate over
elements of a container object programmer has to build an arithmetic
progression and access the element by referring to index. In a situation with
nested for
statements the code becomes harder to read and debug. For this
reason, Python provides a better way to iterate over objects.
Let's rewrite the previous code block in a more Pythonic way:
>>> planets = ("Mercury", "Venus", "Earth", "Mars",)>>> for planet in planets:... print(planet)...MercuryVenusEarthMars
Now this is much better because the "business logic" inside the loop is clear.
Notice how the for
loop actually behaves as the foreach
loop where Python simply knows how
to iterate over every element inside the planets
without programmer having to
refer to indexes. We've introduced variable planet
which is an iteration
variable and gets updated in each iteration with a new element from the for
loop. This behaviour removes the noise of building an arithmetic progression
and referring to objects with indexes. At the same time it gives a more
intuitive iteration experience since the code can be read as pseudo code.
In terms of performance looping_performance.py is a script that measures the execution time of the two examples and shows that the second example preforms significantly faster than iterating using indexes.
for
Statement With IterablesUntil now I've deliberately avoided talking about the type of objects which
were passed into the for
statement. At this point, however, I'd like to focus
on the following question: "What type of objects can we pass into the for
statement?"
In the previous examples, we've ignorantly passed two different types of
objects into the for
statement (range
1 and tuple
) and assumed that
Python knows how to iterate over these objects. That being said, what would
happen if we try to iterate over an object of type integer (int
)? Let's take
a look at the following example:
>>> for number in 42:... print(number)...Traceback (most recent call last): File "<stdin>", line 1, in <module>TypeError: 'int' object is not iterable
Here we get an error message! More specifically, Python gives us a TypeError
saying that it doesn't know how to iterate over int
objects because
apparently int
is not an iterable. With a bit of help from Python, in this
example we came to the conclusion that in order for an object to get along with
the for
statement it has to be iterable.
An iterable is an object that knows how to return its members one at a time.
Iterable objects are required to implement either __iter__
or __getitem__
methods. These methods tell Python the order in which it should return elements
inside the iterable. Many of the default types such as list
, tuple
, dict
and others are iterable and we don't have to worrying about implementing their
dunder methods, we simply pass them into the for
statement.
In order to see if an object is iterable we can call the iter
built-in
function and pass the object as an argument. Here's an example of passing an
object of type list
as an argument inside the iter
function:
>>> planets = ("Mercury", "Venus", "Earth", "Mars",)>>> iter(planets)<tuple_iterator object at 0x7f83e198bbe0>
We can see that this call was successful because we didn't get any error! However, if we try doing the same with an integer argument we get the following message:
>>> iter(42)Traceback (most recent call last): File "<stdin>", line 1, in <module>TypeError: 'int' object is not iterable
Notice that this error message is identical to the one we got when we passed an
int
to the for
statement!
for
StatementPython has no way of knowing how to iterate over an object passed to the for
statement and therefore it calls the iter
built-in function with the object
as an argument to get help. This is the reason we get the exact same error
message when passing int
object to the for
statement and the iter
function.
The iter
function searches the __iter__
and __getitem__
methods inside
the object to find the order in which elements should be iterated over and
returns a very simple object called iterator (not to be mistaken for
iterable). The iterator is responsible for telling the for
statement exactly
how to iterate over the iterable object through a method called __next__
. An
iterator object can be called with Python's next
function to return an
element from the iterable that was used for creating the iterator. Once all the
elements from have been returned next
function raises a StopIteration
exception.
Hopefully, at this point you understand that the for
statement is top of an
iceberg when it comes to iteration. Under the surface there are functions like
iter
and next
that do a lot of work. Therefore, the for
statement is
considered more of an interface for interacting with iterable objects. We can
rewrite the example with planets
and expose the mechanism behind the for
statement:
>>> planets = ("Mercury", "Venus", "Earth", "Mars",)>>> iplanets = iter(planets)>>> while True:... try:... planet = next(iplanets)... except StopIteration:... break... else:... print(planet)...MercuryVenusEarthMars
Let's go over the details one more time: planets
is an iterable object which
means that it can be iterated over since it implements __iter__
or
__getitem__
method. Calling the iter
function on an iterable object returns
an iterator with __next__
method. Afterwards, the iterator object is passed
to the next
function inside an infinite loop and it returns elements from the
planets
object. After the iterator returned every object it gets exhausted
and raises a StopIteraton
exception at which point Python exits the infinite
loop.
else
ClauseThe else
clause that follows the for
statement is not a well-known or used
feature but it is a part of Python's flow control. The previously described
mechanism need to be slightly modified in order to support the else
clause.
Here's how the code looks with the else
clause:
>>> planets = ("Mercury", "Venus", "Earth", "Mars",)>>> iplanets = iter(planets)>>> _loop = True>>> while _loop:... try:... planet = next(iplanets)... except StopIteration:... _loop = False... else:... print(planet)... else:... print(f"The `break` statement didn't interrupt the loop!")...MercuryVenusEarthMarsThe `break` statement didn't interrupt the loop!
In case of the while
statement Python runs the code inside the else
clause
only when the logical expression in the while
loop becomes False
.
Therefore, we only have to changed the way of exiting the infinite loop:
instead of using the break
statement in this case the loop terminates with
_loop = False
.
With this implementation, the else
clause shouldn't run when we use the
break
statement inside the loop body. Let's add only two lines which include
the break
statement in the loop of the code and test if the else
clause
runs:
>>> planets = ("Mercury", "Venus", "Earth", "Mars",)>>> iplanets = iter(planets)>>> _loop = True>>> while _loop:... try:... planet = next(iplanets)... except StopIteration:... _loop = False... else:... print(planet)... if planet == "Earth":... break... else:... print(f"The `break` statement didn't interrupt the loop!")...MercuryVenusEarth
The test uses the break
statement to exit the loop when the iteration reaches
the "Earth"
object and the else
clause doesn't run. Our test was a success!
We've seen that the for
statement represents syntactic sugar which acts as an
interface for interacting with iterable objects. Additionally, we've exposed
the underlying mechanism and showed that under the surface the for
statement
calls the iter
and next
built-in functions. Hopefully, you've realized that
the iterator protocol lies in the heart of Python's iteration.
range
function returns an object of type list
while
in Python 3.x it returns an object of type range
.↩