Map and Filter

Learning Objectives

At the end of this sub-unit, students should

understand the built-in map and filter.
know how to use built-in map and filter.

Preliminary

We have had a rudimentary implementation of our own map and filter. However, this requires a complicated tricks which we did not explain. The idea remains the same, we want to either apply a function F to all elements for mapf or select elements using a predicate P for filterf. There is already a built-in function called map and filter implemented in Python¹. Instead of giving the implementation, we will describe the behavior visually. This is because the built-in map and filter uses higher-order function we have not covered yet.

Function as Value

The basic technique we will be using in this sub-unit is the idea that function is really just like any other values. What can we do with any value, what operations can we do? Can we use the operator +? Unfortunately no, if our value is a range, we cannot operate with +.

The most general thing we can do with any value is that we can assign that to a variable. We did this before with lambda. We assigned a lambda into a variable. Because any value can be assigned into a variable, whatever a variable of any type can do, a function can also do. This includes the following.

A function can be passed as an argument to another function.
A function can be stored inside a sequence (e.g., tuple, list, etc).

Additionally, a function can be invoked. So a function invocation is a special operation that a function can do.

Function as Value

def add(x, y):
  return x + y
def sub(x, y):
  return x - y

also_add = add
func = [add, sub]

>>> add(1, 2)
3
>>> also_add(2, 3)  # add(2, 3)
5
>>> func[1](10, 1)  # sub(10, 1)
9

We will give the example for passing functions as arguments in the next section.

Map

The idea of a mapping is to transform each element in a sequence into another element using an arbitrary function F. We can describe the behavior of the built-in map using the following image.

Map01

The function F is supplied to the built-in function map as an argument. We can do this because a function or lambda is just a value. The simplified definition of map is as follows.

map(function, iterable)

An iterable is simply an object that can be iterated using for-loop. This includes sequences. We will be using map and filter mainly for sequences.

What we really want to know is how to use map. Consider the previous problem of multiplying a list of integer by n. We can solve this with map as follows.

Multiply by n

def map_n(lst, n):
  return map(lambda item: item * n, lst)

Let us look at it more closely. We know that the lambda lambda item: item * n is a straightforward function because it uses only its parameter (i.e., item) and the variable from its scope (i.e., n), It also does not update anything.

Now that is out of the way, based on the behavior of map above, we can see that the call map_n([1, 2, 3, 4], 5) will be performing the following operation.

Map02

So based on this explanation, we should be getting the list [1, 10, 15, 20]. We can test this.

for elem in map_n([1, 2, 3, 4], 5):
  print(elem)

So now, if our task is to operate each element in a list, we can use map as long as we can define the operation using function. This function may be created using lambda to use the local variable as long as the function is a straightforward function. Take for instance, the case of computing the factorial of each element in a list. Our F should be factorial. The recursive version of factorial is reproduced below.

def factorial(n):
  if n == 0:
    return 1
  else:
    return n * factorial(n - 1)

>>> for elem in map(factorial, [1, 2, 3, 4, 5]):
...   print(elem)
...
1
2
6
24
120

Note the order of arguments above. The first argument is a function and the second argument is a sequence. How is the first argument a function? Well, it is simply the function name. It is the same as if we are doing f = factorial and invoking map(f, [1, 2, 3, 4, 5]).

Filter

So that is the behavior of map. What about filter? Unlike map, filter takes in a predicate P. This predicate returns a boolean value. If the result of the P(elem) is True, we keep the elem. Otherwise, we remove the elem from the result. Since we need to make the result compact, there is a compactification process that removes the actual element so that the resulting indexes are consecutive.

Filter01

We can now solve the problem of keeping only multiple of n using filter as follows.

Keep Multiple of n

def filter_n(lst, n):
  return filter(lambda item: item % n == 0, lst)

Let us try it on filter_n([0, 1, 2, 3], 3). The diagram is shown below.

Filter02

Do not forget to also test this yourself to see the result.

for elem in filter_n([0, 1, 2, 3], 3):
  print(elem)

1
2

0
3

What other thing can we do with filter? Let us try something more complicated. Consider the following ADT for student called student. Can you specify the type annotation?

Student

def make_student(name, matric, gpa):
  return (name, matric, gpa)

def get_gpa(std):
  return std[2]

Now, let's say we are given a list of student in a class. We want to give a letter of commendation for students with gpa of 4.6 and above. How do we get these students? The straightforward function is given as the following lambda.

lambda std: get_gpa(std) >= 4.6

Let us test this behavior.

students = [
  make_student('Adi'   , 'A0000001A', 5.0 ),
  make_student('Ashish', 'A0000002B', 4.5 ),
  make_student('Nitya' , 'A0000003C', 4.8 ),
  make_student('Daren' , 'A0000004D', 4.59),
  make_student('Sanka' , 'A0000004D', 4.3 )
]

>>> for std in filter(lambda std: get_gpa(std) >= 4.6, students):
...   print(std[0])
...
Adi
Nitya

Common Mistakes

Because map and filter are often introduced together, there are common mistakes that people often makes. Some of the mistake occurs because the two functions are so similar in their definition. The first argument is a function² and the second argument is an iterable³.

So it is quite common to see that map is used when filter is intended and vice versa. What is the effect of that? In the case of using map instead of filter, we are using a predicate in place of a general function. But a predicate is a function. So in the result, we will see booleans.

Map Instead of Filter

>>> for elem in filter(lambda item: item % 2 == 0, [1, 2, 3, 4]):
...   print(elem)
...
2
4
>>> for elem in map(lambda item: item % 2 == 0, [1, 2, 3, 4]):
...   print(elem)
...
False
True
False
True

On the other hand, if we use filter instead of map, the result may be indecipherable. This is related to another bad practice involving truthy/falsy values. Although filter is expecting a predicate which is supposed to evaluate to boolean values True or False, if the values supplied are not boolean, it will be forced into a boolean value. So it is as if the values are treated like boolean. Hence, instead of True or False, we have truthy/falsy values.

In the case the value is truthy, we keep the element. Otherwise, we remove the element. Consider the simple case of transforming each element into its remainder when divided by 2. We can use the following lambda lambda item: item % 2. When this lambda is passed as argument to filter, we have two cases.

The remainder is 0.
- 0 is falsy, so the element is removed.
The remainder is 1.
- 1 is truthy, so the element is kept.

This is the opposite of checking if the value is even (i.e., lambda item: item % 2 == 0). It is in fact, equivalent to checking if the value is odd (i.e., lambda item: item % 2 == 1). Of course this works nicely for divisibility by 2. On divisibility by 3, we are keeping only values that are not divisible by 3.

Filter Instead of Map

>>> for elem in map(lambda item: item % 2, [1, 2, 3, 4]):
...   print(elem)
...
1
0
1
0
>>> for elem in filter(lambda item: item % 2, [1, 2, 3, 4]):
...   print(elem)
...
1
3

But this is only to highlight the potential confusion when we mistakenly use filter instead of map. The remedy is to always design your program well because often debugging these problems will not be easy.

Basic Properties

Using the diagram for map and filter, we can describe some basic properties about these functions. These basic properties will be important when we are dealing with multi-dimensional sequences.

Map
Filter

- The number of elements is unchanged.
- The value may be updated.
- The order is unchanged.
- The number of elements may be fewer.
- The value is unchanged (if not removed).
- The relative ordering is unchanged (if not removed).

We will differentiate the built-in function map and filter by using color. If the color is purple, it means we are using the built-in version. ↩
Predicate is also a function. ↩
A sequence is also an iterable. ↩