Abstraction
Learning Objectives
At the end of this sub-unit, students should
- appreciate the benefit of procedure abstraction.
- know how to solve problems using basic abstraction.
Good Abstraction
Let us look back at the code for finding the distance between two points (0, x) and (y, 0). This is also known as the hypothenuse and computed using the function we have defined before.
Hypotenuse v1
The most basic question we can ask is, why do we solve it this way? Why not the shorter code below.
There are several good reasons for this, we will explain 6 of them here. Some of the explanation will be further expanded into problem solving technique in future units.
- Makes programs easier to understand.
- Captures common patterns.
- Allows for code reuse.
- Hides Irrelevant Details
- Makes fixing errors easier.
Makes Programs Easier to Understand
The computation of the hypotenuse might be a bad example because the code is rather straightforward. We can say that the shorter version "Hypotenuse v2" is easier to understand. But consider the following two codes. We use single letter variables to further illustrates how bad it is when exacerbated by writing all the code in a single function.
-
Mystery v1
-
Mystery v2
To understand the code on the left, we have to understand the code in its entirety.
On the other hand --for the code on the right-- if we understand what the function fn
is doing, we can understand the behavior of the mystery function much easier.
In fact, fn
is exactly the factorial function.
So now, the mystery function can be thought of as a summation for the following form.
This comes from the understanding that the fn(i)
computes \(i!\).
Which means, (x ** i) / fn(i)
computes \(\frac{x^i}{i!}\) for a given value of \(i\).
Therefore, the while-loop is a summation where the result is stored in the variable y
of the expression above.
Captures Common Patterns
There is another lesson we can learn from the mystery function above.
We know that expression it computes is the summation of \(\frac{x^i}{i!}\).
But the value of \(i!\) is computed using the function fn(i)
.
Does it need to be the factorial function?
Here, we can see that the true expression that is computed by the mystery function is in fact \(\frac{x^i}{\text{fn}(i)}\).
It just so happen that \(\text{fn}(i)\) is exactly the factorial function.
So if we want to compute the sum of \(\frac{x^i}{i}\) instead, then we do not even have to modify the mystery function.
We can simply modify fn
to the following.
We can take this a step further. Can we abstract out the mystery function even further? Using the idea of a function, we can even say that we want to have a function \(\text{f}(i)\) so that we can rewrite the mystery function as the code on the left and the expression it computes is given on the right.
-
Abstracted Mystery
-
Expression Computed
\[ \text{summation}(x, n) = {\displaystyle \sum_{i=0}^{n} \text{f}(i)} \]
This is a much more useful function. In fact, it is so useful we rename it immediately into summation
.
Almost any summation we want can be computed with this code.
What we need to do is simply to write the appropriate code for f(n)
.
Of course this is still not the most general.
Making it more general will require more complicated function.
But even with this, we can compute many things.
At the very least, we can compute the above easily.
We can also compute the following by defining the appropriate f
.
Allows for Code Reuse
Let us assume that we have written the solution for the hypotenuse as "Hypotenuse v1".
We have defined a number of functions here: hypot(x, y)
, sum_sqr(x, y)
, sqrt(x)
, and sqr(x)
.
If we have another problem at hand, say computing the area of a circle given a radius r
, do we have to solve this from scratch again?
The formula for the area is given below.
Which of the functions above can we use?
Since there \(r^2\), we can reuse the sqr
function that we have written before.
Of course this seems like a trivial example because we can also use r * r
or r ** 2
, but we can further extend this.
The surface of a ball is given by the formula \(4 \pi r^2\) and we know that \(\pi r^2\) is the area of a circle.
So we can now solve the following easily.
This process can be continues further. In fact, if we have a book containing solution to simpler problems, we can then try to solve our current problem assuming we have a solution to the simpler problem.
Hides Irrelevant Details
We had two ways of computing the square of a number x
.
We can use x * x
or we can use x ** 2
.
By abstracting such computation into a function, we do not have to care about how we compute the square of a number.
If by some chance, we found a faster way to compute things, we simply have to change the implementation inside the function without changing how the function is invoked.
In one of the exercise, we have a way to compute the value of \(\text{sin}(x)\) up to a given precision. This was done using the technique from Taylor expansion of \(\text{sin}(x)\). If we improve our mathematics and come up with a smarter way to compute sine, we can again replace the implementation. Alternatively, this can be done by people much smarter than us who have done the mathematics. We simply use their technique in our code.
Faster Sine
For a limite precision, we can use the following approximation.
To hide the implementation details, we need to first separate between specification and implementation. As long as have a complete specification, we do not have to care about the implementation.
Makes Fixing Errors Easier
The act of fixing errors is called debugging. We will illustrate this with the incorrect implementation of hypotenuse. The question is, where is the bug and which version is easier to find?
-
Hypotenuse v1
-
Hypotenuse v2
Here, the mistake is in the squaring.
Instead of x + x
, we should have done x * x
.
For "Hypotenuse v1", we can look at each function independently and decide if the function is correct.
The correctness can be based on the name of the function or other forms of specification.
If we look at the function one by one, we will realize that the functions are all correct except for sqr
.
Additionally, the fix is simpler.
In v1, we simply update the function sqr
and we do not have to change all the other function.
On the other hand, in v2, we have to make two replacements.
x + x
tox * x
.y + y
toy * y
.
This is in line with our motto of "assign once, use multiple times". Obviously, in this case, we replace assign with define. As our programming problems become more complicated, we hope that you will follow the idea of abstraction. We will illustrate problem solving techniques that uses abstraction as the basis.