Welcome to part 4 of the intermediate Python programming tutorial series. In this part, we're going to talk about list comprehension and generators.
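To begin, let's show a quick example of each and the reason for using it. A commonly used example of generating values lazily is Python 3's range() (Python 2's xrange()). Strictly speaking, range() returns a lazy sequence object rather than a true generator, but it illustrates the same idea: values are produced on demand.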
If you want to iterate through something 4 times, you might say something like:
for i in range(4):
    do_something()  # placeholder for whatever work you need done
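What's range() doing here? It behaves much like a generator: rather than building a list of values up front, it produces the values in order, on the fly. (Technically, Python 3's range() returns a lazy range object rather than a generator, but the principle is the same.) An example of a generator expression: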
xyz = (i for i in range(50000000))
print(list(xyz)[:5])
An example of list comprehension:
xyz = [i for i in range(50000000)]
print(xyz[:5])
These look and appear to act very similarly, but they are quite different under the hood. A generator does not build or copy a collection of values; it produces each value on the fly, only when asked for it. This means we will use far less memory, since the full sequence is never held in memory at once, but it also means the process tends to be a bit slower, since each value is computed as we go.
The list comprehension puts the entire list into memory, so it is faster, but the penalty is memory use.
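To make the memory trade-off concrete, here is a minimal sketch that compares the size of the two objects with sys.getsizeof. The value of n is an assumption chosen so the example runs quickly; it is much smaller than the 50,000,000 used above.

```python
import sys

n = 100_000  # assumed size, far smaller than the tutorial's 50,000,000

gen = (i for i in range(n))   # generator expression: no values computed yet
lst = [i for i in range(n)]   # list comprehension: all n values stored now

# sys.getsizeof reports the size of the container object itself,
# not the integers it refers to, but the gap is already clear.
gen_size = sys.getsizeof(gen)
lst_size = sys.getsizeof(lst)

print(gen_size)  # small, and does not grow with n
print(lst_size)  # grows with n
```

The exact numbers depend on your Python version and platform, so treat them as a rough indication rather than a benchmark.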
Thus, generally, you will use generators for huge ranges/sequences (including infinite ones), and otherwise use list comprehension. Notice that when we wanted to output the first 5 items of xyz in the generator example, we first had to convert it to a list, since a generator object cannot be sliced. That conversion builds all 50,000,000 values, which gives up the memory savings. The list comprehension example was already a list.
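If you only need the first few values from a generator, one option is itertools.islice from the standard library, which pulls items without building the whole list. The sketch below reuses the xyz generator from above; the evens generator is a hypothetical example added to show an infinite sequence.

```python
from itertools import count, islice

xyz = (i for i in range(50000000))

# islice pulls only the first 5 values; the rest are never produced.
first_five = list(islice(xyz, 5))
print(first_five)  # [0, 1, 2, 3, 4]

# The generator remembers its position, so next() continues from there.
sixth = next(xyz)
print(sixth)  # 5

# The same approach works for an infinite sequence, which could never be a list.
evens = (n for n in count() if n % 2 == 0)
first_evens = list(islice(evens, 3))
print(first_evens)  # [0, 2, 4]
```

Note that a generator can only be consumed once: the values taken by islice and next() are gone from xyz afterwards.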
In many cases, you won't necessarily be worried about memory or speed. But if you are working on mobile devices with less power, or with huge datasets where you don't actually need the entire list at once, these distinctions matter. Remember too that a standard Python program normally runs on a single CPU core, so on its own it is only as powerful as that one core.
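As a rough illustration of processing data without holding all of it at once, here is a small sketch using a generator function. The squares function and the limit of 1000 are hypothetical, chosen only for the example.

```python
def squares(limit):
    """Yield square numbers one at a time instead of building a list."""
    for i in range(limit):
        yield i * i


# sum() consumes the values one by one, so only a single square
# exists in memory at any moment.
total = sum(squares(1000))
print(total)  # 332833500
```

The same pattern applies when the values come from a large file or data feed rather than from range().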
In the next tutorial, we're going to talk more on list comprehension and generators.