Stream
Learning Objectives
At the end of this sub-unit, students should
- understand the peculiarities of the built-in map and filter.
- know how to avoid pitfalls related to map and filter.
A Deeper Look
There are quite a number of things we did not explain properly about the built-in map and filter.
This was hidden by the way we used them, but the more perceptive of you probably noticed that we only ever used their results in a for-loop.
If you tried them out on your own, believing that the results of map and filter are sequences, you would have been disappointed, because they are not.
Consider the simple case of indexing.
We get the following error if we try to index the result of map or filter.
Map Indexing
Filter Indexing
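A minimal sketch of the indexing attempt is given here. The names numbers, doubled, evens, and try_index are made up for illustration, and the exact wording of the error message is what CPython typically prints.

```python
numbers = [1, 2, 3]

doubled = map(lambda x: x * 2, numbers)
evens = filter(lambda x: x % 2 == 0, numbers)


def try_index(obj):
    """Hypothetical helper: return obj[0], or the error text if indexing fails."""
    try:
        return obj[0]
    except TypeError as err:
        return f"TypeError: {err}"


print(try_index(numbers))  # 1, because a list can be indexed
print(try_index(doubled))  # TypeError: 'map' object is not subscriptable
print(try_index(evens))    # TypeError: 'filter' object is not subscriptable
```

Writing doubled[0] directly raises the same TypeError; the helper only catches it so that all three cases can be printed in one run.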
This clearly shows that the results of map and filter are not sequences.
They are definitely not lists or tuples.
So what are they?
Recall that we can check the type using the type(...) function.
Let us check.
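The check can be sketched as follows; the variable names and the sample list are assumptions chosen for illustration.

```python
doubled = map(lambda x: x * 2, [1, 2, 3])
evens = filter(lambda x: x % 2 == 0, [1, 2, 3])

print(type(doubled))  # <class 'map'>
print(type(evens))    # <class 'filter'>
```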
That does not help much. But it at least gives us a glimpse that it is not a sequence. The question is, what is it exactly? We did not have a name for this previously, so let us give it one. Recall the operation that we can perform on the result: we can iterate over it with a for-loop. So it is reasonable to name this type an iterable.
Iterable
Consider a data type \(T\) with a variable var of type \(T\).
We can say that \(T\) is an iterable if the following operation can be performed, assuming valid input.
- Iteration: It can be used in a for-loop.
  - Without the for keyword, the expression instead checks whether an element is inside the sequence (i.e., elem in seq).
So the result of map and filter is an iterable [1].
If we arrange these types in a hierarchy, it looks like the following.
Note that the results of map and filter are special kinds of iterables.
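One way to probe this hierarchy from code is with the abstract base classes in the standard collections.abc module. This is only a sketch: the dictionary values and its contents are made up for illustration, and the course itself may present the hierarchy differently.

```python
from collections.abc import Iterable, Sequence

# Sample values chosen for illustration only.
values = {
    "list": [1, 2, 3],
    "tuple": (1, 2, 3),
    "map": map(lambda x: x * 2, [1, 2, 3]),
    "filter": filter(lambda x: x > 1, [1, 2, 3]),
}

for name, value in values.items():
    # Every sequence is an iterable, but not every iterable is a sequence.
    print(name, isinstance(value, Iterable), isinstance(value, Sequence))

# list True True
# tuple True True
# map True False
# filter True False
```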
Single Use
So we know that the results of map and filter are at least iterables.
They are actually a special kind of iterable.
In Python terminology, the more precise name is an iterator (a generator is one closely related kind of iterator).
However, we will use the name stream, as it is the more commonly accepted name in general.
A stream is an iterable that is single use. It is like looking at a flowing stream of water. Once the stream has flowed, if nothing replenishes it (e.g., a continuous water source), the stream dries up. It can no longer be used.
The same is true for map and filter.
Once the data is used up (e.g., in a for-loop), the next time we try to use it, it will be empty.
We can illustrate this with the example below.
Note how the second for-loop on the same result of map and filter does not produce any output.
Map
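A minimal sketch of the single-use behaviour, with made-up variable names. Each result is looped over twice, and the values seen in each pass are collected into a list so that they can be compared.

```python
numbers = [1, 2, 3, 4]

doubled = map(lambda x: x * 2, numbers)
evens = filter(lambda x: x % 2 == 0, numbers)

map_first, map_second = [], []
for value in doubled:      # first pass uses up the stream
    map_first.append(value)
for value in doubled:      # second pass finds nothing left
    map_second.append(value)

filter_first, filter_second = [], []
for value in evens:
    filter_first.append(value)
for value in evens:
    filter_second.append(value)

print(map_first, map_second)        # [2, 4, 6, 8] []
print(filter_first, filter_second)  # [2, 4] []
```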
Why Single Use?
The reason why both map and filter are single use is that their results are computed lazily.
In other words, unless the data is actually used, nothing is computed.
Because of this, once a value is computed and used, it is immediately discarded to save space.
Try running the following and see the extreme difference in the timing.
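The sketch below is one possible experiment, not necessarily the one originally shown. It assumes a deliberately slow function, here called slow_square, and records every call so that we can see when the work actually happens. The exact timings will differ from machine to machine.

```python
import time

calls = []  # records every argument slow_square is actually called with


def slow_square(x):
    """Hypothetical expensive function: sleeps briefly before answering."""
    time.sleep(0.01)
    calls.append(x)
    return x * x


start = time.perf_counter()
squares = map(slow_square, range(20))   # nothing is computed yet
create_seconds = time.perf_counter() - start
calls_after_create = len(calls)

start = time.perf_counter()
result = list(squares)                  # forces all 20 computations
consume_seconds = time.perf_counter() - start

print(f"creating the map:  {create_seconds:.6f} s, calls so far: {calls_after_create}")
print(f"consuming the map: {consume_seconds:.6f} s, calls so far: {len(calls)}")
```

Creating the map returns almost instantly with zero calls made, while consuming it takes roughly 20 times the sleep duration.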
Reusing Map and Filter
Given this limitation of single use, what can we do to preserve the values?
At the beginning, we mentioned that a type (e.g., int, float, etc.) can be used to convert from one data type to another.
This is a very useful lesson because it also works for list and tuple.
And that is exactly how the result from map or filter can be made permanent.
We can convert it into a list (or tuple) and assign it to a variable.
Map
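A minimal sketch of the conversion, with made-up variable names. Once converted, the result can be indexed and looped over as many times as we like.

```python
numbers = [1, 2, 3, 4]

# Convert immediately so the values are stored, not streamed.
doubled = list(map(lambda x: x * 2, numbers))
evens = tuple(filter(lambda x: x % 2 == 0, numbers))

print(doubled[0])  # 2, indexing now works
print(evens)       # (2, 4)

first_pass = [value for value in doubled]
second_pass = [value for value in doubled]
print(first_pass == second_pass)  # True, the list is not used up
```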
With this, we can give a more appropriate solution for map_n and filter_n from the previous sub-unit.
For good measure, to avoid potential errors in the future, always convert the result of map and filter into a list or tuple.
The only reason not to convert them is efficiency.
But we are more concerned with correctness at this level.
[1] To be even more precise, map and filter are actually data types of their own. They are constructors rather than functions. We can see this from the output <class 'map'>.