✢ Memoization
Learning Objectives
At the end of this sub-unit, students should
- appreciate the similarity between data and computation.
Pure Function as Data
In the previous sub-unit, we encountered a curious fact that a mapping table can be represented as either one of the following.
- A function with a sequence of if-statements.
- A dictionary which is a mapping from key to value.
Since the table is finite in this case, we have a complete mapping.
-
As Function
-
As Dictionary
Can this be generalized further? At some point, the dictionary may be very large so it may not be worth it. In fact, for some operation such as add 1, the table is theoretically infinite but the code is rather trivial.
-
As Function
-
As Dictionary
Ok, so there is a similarity between data and computation, but what is the use of this?
We can speed up a computation by storing the result of a repeated computation.
To do this, we first represent a function f(param1, param2)
as a mapping from (param1, param2)
to some value.
This only works for pure function where the result is purely dependent in the input parameter.
So to represent this as a dictionary, we set the key as the tuple (param1, param2)
.
If we have more parameters, we simply add more values.
To avoid pre-computing all the possible mappings, we will add the result only after we compute at least once.
So the next time we invoke the function f
with the same parameter, we do not need to perform the computation.
We can simply retrieve the result from the dictionary.
Let us illustrate this with the factorial function. The original recursive factorial function is reproduced below.
The conversion to the memoized version will be done as follows.
- Before we perform the computation, we check if the parameter has already been computed before.
- We check this by looking at the key in the dictionary.
- If it has been computed, we simply return the corresponding value from the dictionary.
- Otherwise, we perform the computation.
- At the end of the computation, we update the dictionary by adding the mapping from the current parameter to the corresponding result.
Memoized Factorial
Fibonacci
This idea of memoization is useful for non-linear recursion especially non-linear recursion where there can be repeated computation. Take for instance the Fibonacci series represented by the following recursive formula.
We can write the code as follows.
We can see here that fib(n - 2)
will be computed at least twice because when we compute fib(n - 1)
, we will continue with the following: fib(n - 1) ⇝ fib(n - 2) + fib(n - 3)
.
So it will be very very useful if we can memoize the value of fib(n - 2)
.
This way, we do not have to recompute it the next time we encounter it.
Memoized Fibonacci
Memo as List
Our current memo uses a dictionary. But our parameter for Fibonacci is actually an integer. So do we really need a dictionary? Can we use a list? In this case, yes. The function is as followed.
List Fibonacci
Now that we write our Fibonacci this way, we can see that fib(n)
will eventually create a list of size n + 1
where the value at index n
is exactly the solution to fib(n)
.
Notice that fib(n)
requires results from fib(n - 1)
and fib(n - 2)
.
These results will eventually be stored in index n - 1
and n - 2
.
Hence, if we fill in the list from the smallest value to the largest value, we can fill in the memo from left to right.
This gives us the following code. In the following code, we put the memo inside the function.
Dynamic Fibonacci
Now we can see that we only see need the previous two values at each iteration (i.e., MEMO[-1]
and MEMO[-2]
).
So we do not need the entire list.
In fact, we only need two variables.
This gives us the following iterative solution for Fibonacci.