Map and Filter
Learning Objectives
At the end of this sub-unit, students should
- understand the built-in
map
andfilter
. - know how to use built-in
map
andfilter
.
Preliminary
We have had a rudimentary implementation of our own map
and filter
.
However, this requires a complicated tricks which we did not explain.
The idea remains the same, we want to either apply a function F
to all elements for mapf
or select elements using a predicate P
for filterf
.
There is already a built-in function called map
and filter
implemented in Python1.
Instead of giving the implementation, we will describe the behavior visually.
This is because the built-in map
and filter
uses higher-order function we have not covered yet.
Function as Value
The basic technique we will be using in this sub-unit is the idea that function is really just like any other values.
What can we do with any value, what operations can we do?
Can we use the operator +
?
Unfortunately no, if our value is a range
, we cannot operate with +
.
The most general thing we can do with any value is that we can assign that to a variable. We did this before with lambda. We assigned a lambda into a variable. Because any value can be assigned into a variable, whatever a variable of any type can do, a function can also do. This includes the following.
- A function can be passed as an argument to another function.
- A function can be stored inside a sequence (e.g.,
tuple
,list
, etc).
Additionally, a function can be invoked. So a function invocation is a special operation that a function can do.
Function as Value
We will give the example for passing functions as arguments in the next section.
Map
The idea of a mapping is to transform each element in a sequence into another element using an arbitrary function F
.
We can describe the behavior of the built-in map
using the following image.
The function F
is supplied to the built-in function map
as an argument.
We can do this because a function or lambda is just a value.
The simplified definition of map
is as follows.
map(function, iterable)
An iterable is simply an object that can be iterated using for-loop.
This includes sequences.
We will be using map
and filter
mainly for sequences.
What we really want to know is how to use map
.
Consider the previous problem of multiplying a list of integer by n
.
We can solve this with map
as follows.
Let us look at it more closely.
We know that the lambda lambda item: item * n
is a straightforward function because it uses only its parameter (i.e., item
) and the variable from its scope (i.e., n
),
It also does not update anything.
Now that is out of the way, based on the behavior of map
above, we can see that the call map_n([1, 2, 3, 4], 5)
will be performing the following operation.
So based on this explanation, we should be getting the list [1, 10, 15, 20]
.
We can test this.
So now, if our task is to operate each element in a list, we can use map
as long as we can define the operation using function.
This function may be created using lambda
to use the local variable as long as the function is a straightforward function.
Take for instance, the case of computing the factorial of each element in a list.
Our F
should be factorial.
The recursive version of factorial
is reproduced below.
Note the order of arguments above.
The first argument is a function and the second argument is a sequence.
How is the first argument a function?
Well, it is simply the function name.
It is the same as if we are doing f = factorial
and invoking map(f, [1, 2, 3, 4, 5])
.
Filter
So that is the behavior of map
.
What about filter
?
Unlike map
, filter
takes in a predicate P
.
This predicate returns a boolean value.
If the result of the P(elem)
is True
, we keep the elem
.
Otherwise, we remove the elem
from the result.
Since we need to make the result compact, there is a compactification process that removes the actual element so that the resulting indexes are consecutive.
We can now solve the problem of keeping only multiple of n
using filter
as follows.
Let us try it on filter_n([0, 1, 2, 3], 3)
.
The diagram is shown below.
Do not forget to also test this yourself to see the result.
What other thing can we do with filter
?
Let us try something more complicated.
Consider the following ADT for student called student
.
Can you specify the type annotation?
Student
Now, let's say we are given a list of student in a class. We want to give a letter of commendation for students with gpa of 4.6 and above. How do we get these students? The straightforward function is given as the following lambda.
lambda std: get_gpa(std) >= 4.6
Let us test this behavior.
Common Mistakes
Because map
and filter
are often introduced together, there are common mistakes that people often makes.
Some of the mistake occurs because the two functions are so similar in their definition.
The first argument is a function2 and the second argument is an iterable3.
So it is quite common to see that map
is used when filter
is intended and vice versa.
What is the effect of that?
In the case of using map
instead of filter
, we are using a predicate in place of a general function.
But a predicate is a function.
So in the result, we will see booleans.
Map Instead of Filter
On the other hand, if we use filter
instead of map
, the result may be indecipherable.
This is related to another bad practice involving truthy/falsy values.
Although filter
is expecting a predicate which is supposed to evaluate to boolean values True
or False
, if the values supplied are not boolean, it will be forced into a boolean value.
So it is as if the values are treated like boolean.
Hence, instead of True
or False
, we have truthy/falsy values.
In the case the value is truthy, we keep the element.
Otherwise, we remove the element.
Consider the simple case of transforming each element into its remainder when divided by 2.
We can use the following lambda lambda item: item % 2
.
When this lambda is passed as argument to filter, we have two cases.
- The remainder is
0
.0
is falsy, so the element is removed.
- The remainder is
1
.1
is truthy, so the element is kept.
This is the opposite of checking if the value is even (i.e., lambda item: item % 2 == 0
).
It is in fact, equivalent to checking if the value is odd (i.e., lambda item: item % 2 == 1
).
Of course this works nicely for divisibility by 2.
On divisibility by 3, we are keeping only values that are not divisible by 3.
Filter Instead of Map
But this is only to highlight the potential confusion when we mistakenly use filter
instead of map
.
The remedy is to always design your program well because often debugging these problems will not be easy.
Basic Properties
Using the diagram for map
and filter
, we can describe some basic properties about these functions.
These basic properties will be important when we are dealing with multi-dimensional sequences.
-
Map
-
Filter
-
- The number of elements is unchanged.
- The value may be updated.
- The order is unchanged.
-
- The number of elements may be fewer.
- The value is unchanged (if not removed).
- The relative ordering is unchanged (if not removed).