
Order of Growth

Learning Objectives

At the end of this sub-unit, students should

  • appreciate the advantages and disadvantages of multiple representations.

Preliminary

At its core, order of growth is a way to compare how two functions behave as the input size grows. Take, for instance, the problem of getting a movie. Nowadays we usually just stream a movie, but in the past there were two options.

  1. Order the movie online as a disc (e.g., DVD or Blu-Ray) and have it delivered to our house.
  2. Download the movie.

An expedited delivery can take 24 hours, while a typical download takes 2 hours per movie. However, in the case of delivery, it does not matter how many movies we order; they are simply bundled together in the same delivery. On the other hand, the download time depends on how many movies we are downloading.

If we take the typical period of 2 hours per movie, we can see that downloading one movie is faster than having one movie delivered. But once we order more than 12 movies, delivery is faster than downloading. We can represent this as the following graph.

[Figure Growth01: time against number of movies; delivery stays constant at 24 hours while download time grows linearly, crossing over at 12 movies]
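
With the figures above (24-hour delivery, 2 hours per movie to download; the function names are ours), a quick sketch confirms the crossover at 12 movies:

```python
def delivery_hours(num_movies):
    # One bundled delivery takes 24 hours regardless of the order size.
    return 24

def download_hours(num_movies):
    # Each movie takes 2 hours to download.
    return 2 * num_movies

print(download_hours(1))    # 2: downloading one movie beats delivery
print(download_hours(12))   # 24: the break-even point
print(download_hours(13))   # 26: now delivery (24 hours) is faster
```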

So the question related to the order of growth is the following.

If the "volume" increases, by how much do the "resources" grow?

Volume is simply a measure of the size of the input. Resources are typically time or space. In the movie example above, we have the following.

  • Volume: Number of movies.
  • Resources: Time taken to get the movie.

We can now compare the two above as follows.

  • Delivery


    When the number of movies (i.e., volume) doubles, the time (i.e., resource) remains constant.

  • Download


    When the number of movies (i.e., volume) doubles, the time (i.e., resource) doubles.

In the context of a program, we can also talk about the runtime of the program and the amount of space required to run the program. But note that we are NOT interested in the absolute time/space required. We are only interested in the proportion of growth in time and space as the input grows.

Order of growth is the formalization of this idea. The formalization goes as follows.

Order of Growth

A given function \(f(n)\) has an order of growth of \(O(g(n))\) if there are positive constants \(k\) and \(n_0\) such that \(f(n) \leq k \times g(n)\) for all \(n \geq n_0\).

Hold on, how do we read that in English?

Order of Growth

A function \(f(n)\) is bounded above by the function \(g(n)\) if, beyond some value \(n_0\), the value of \(k \times g(n)\) is always at least the value of \(f(n)\).

OK, that is not much better. Let us illustrate this with a graph.

[Figure Growth02: \(f(n)\) in red and \(k \times g(n)\); \(f(n)\) exceeds \(k \times g(n)\) at some small values, but \(f(n) \leq k \times g(n)\) for all \(n \geq n_0\)]

The red line is the function \(f(n)\) that we want to bound above by some function \(g(n)\). To do that, we look at the behavior of \(f(n)\) at very large values. We cannot rely on small values because, as we can see, for \(n < n_0\) there are some values where \(f(n) > k \times g(n)\). But for all \(n \geq n_0\), the value of \(f(n)\) is at most \(k \times g(n)\). So we can conclude that \(f(n)\) is bounded above by \(g(n)\).

The notation \(f(n) \in O(g(n))\) indicates that \(f(n)\) is bounded above by \(g(n)\). This is called the Big-Oh notation. We can also say that \(f(n)\) grows no faster than \(g(n)\).
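
Under this definition, we can sanity-check a claimed bound numerically. The sketch below (the helper and the sample functions are ours) is a spot check over a finite range, not a proof:

```python
def bounded_above(f, g, k, n0, up_to=1000):
    # Spot check of f(n) <= k * g(n) for n0 <= n < up_to.
    return all(f(n) <= k * g(n) for n in range(n0, up_to))

# 2n + 10 <= 3n holds once n >= 10, so f(n) = 2n + 10 is in O(n).
print(bounded_above(lambda n: 2*n + 10, lambda n: n, k=3, n0=10))  # True
# But no choice here makes 2n + 10 fit under g(n) = 1 for large n.
print(bounded_above(lambda n: 2*n + 10, lambda n: 1, k=50, n0=1))  # False
```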

Let us make this more concrete with an example. Consider \(f(n) = n^2 + 2n + 1\). What is an upper bound of this function? Consider \(g(n) = n^3\). We claim that \(f(n) \in O(g(n))\). To show this, we can tabulate the values of \(f(n)\) and \(g(n)\) for some values of \(n\).

\(n\)    \(f(n)\)    \(g(n)\)
1        4           1
2        9           8
4        25          64
8        81          512

So for \(n > 2\), we can see that \(g(n) > f(n)\). But \(n^3\) grows very fast. Is there a function that grows slower yet still bounds \(f(n)\) from above? Believe it or not, we can set \(g(n) = n^2\). This is because we have not used \(k\) yet. Let us construct the table for \(k = 2\), chosen to be just above the coefficient of \(n^2\) in \(f(n)\), which is 1.

\(n\)    \(f(n)\)    \(k \times g(n)\)
1        4           2
2        9           8
4        25          32
8        81          128

Again, for \(n > 2\), we have \(k \times g(n) > f(n)\). This \(g(n) = n^2\) is in fact the tightest bound. We could keep checking other candidates to convince ourselves of this, but luckily, people smarter than us have provided some rules that we can follow. Given \(f(n)\), we can find the tightest \(g(n)\) as follows.

  • Identify the dominant terms.
  • Ignore additive constants.
  • Ignore multiplicative constants.

Using \(f(n) = n^2 + 2n + 1\) above, we can follow the rule.

  • Since we can remove additive constants, we can remove \(1\).
  • The remaining terms are \(n^2\) and \(2n\).
    • The dominant term is \(n^2\).
  • There is no multiplicative constant for \(n^2\).

Therefore, \(f(n) \in O(n^2)\). Try the following examples to find the tightest bound for polynomials. First, consider \(f(n) = 5n^2 - 3n + 100\).

  • Since we can remove additive constants, we can remove \(100\).
  • The remaining terms are \(5n^2\) and \(-3n\).
    • The dominant term is \(5n^2\).
  • Since we can remove multiplicative constant, we are left with \(n^2\).

\(f(n) \in O(n^2)\)

Next, consider \(f(n) = 100n^3 + 1000000n^2 + 100n\).

  • There is no additive constant.
  • The remaining terms are \(100n^3\), \(1000000n^2\), and \(100n\).
    • The dominant term is \(100n^3\).
  • Since we can remove multiplicative constant, we are left with \(n^3\).

\(f(n) \in O(n^3)\)
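
For polynomials, these rules amount to picking the highest power with a nonzero coefficient. A small sketch (the helper name and the coefficient encoding are ours):

```python
def tightest_power(coeffs):
    # coeffs[i] is the coefficient of n**i in f(n).
    # The tightest bound for a polynomial is O(n**k), where k is the
    # highest power with a nonzero coefficient (the dominant term).
    return max(i for i, c in enumerate(coeffs) if c != 0)

# f(n) = n^2 + 2n + 1               -> coeffs [1, 2, 1]
print(tightest_power([1, 2, 1]))               # 2
# f(n) = 5n^2 - 3n + 100            -> coeffs [100, -3, 5]
print(tightest_power([100, -3, 5]))            # 2
# f(n) = 100n^3 + 1000000n^2 + 100n -> coeffs [0, 100, 1000000, 100]
print(tightest_power([0, 100, 1000000, 100]))  # 3
```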

What about other functions? We will show some common terms in increasing order.

Name           Term
Constant       \(1\)
Logarithm      \(\text{log}(n)\)
Linear         \(n\)
Log Linear     \(n \ \text{log}(n)\)
Quadratic      \(n^2\)
\(\vdots\)     \(\vdots\)
Exponential    \(2^n\)
Exponential    \(3^n\)

The terms we elided are combinations of polynomials and logarithms. We can easily order polynomials by their powers. If we use the symbol \(f(n) ≪ g(n)\) to indicate that \(g(n)\) is the more dominant term, we can order polynomials as follows.

\[ n ≪ n^2 ≪ n^3 ≪ n^4 ≪ \cdots \]

Now, because \(\text{log}(n)\) is smaller than \(n\) but larger than \(1\), we can insert \(n^k \ \text{log}(n)\) in between \(n^k\) and \(n^{k+1}\).
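
We can observe this ordering numerically (a spot check at one large \(n\), using base-2 logarithms; any base differs only by a multiplicative constant):

```python
import math

n = 1_000_000
chain = [
    1,                      # constant
    math.log2(n),           # logarithm
    n,                      # linear
    n * math.log2(n),       # log linear
    n ** 2,                 # quadratic
    n ** 2 * math.log2(n),  # n^2 log n sits between n^2 and n^3
    n ** 3,                 # cubic
]
# At this (large) n, each term is strictly smaller than the next.
print(all(a < b for a, b in zip(chain, chain[1:])))  # True
```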

On Real Functions

So that is the mathematics of it, but we are from computer science. How can we use this to analyze the runtime of a function? Consider the factorial function, reproduced below. What is the order of growth of this function with respect to time?

def factorial(n):
  res, val = 1, 1
  while val <= n:
    res = res * val
    val = val + 1
  return res

First, we need to decide what we mean by one step of computation. There are a few candidates, but let us say that an assignment, a comparison, an addition, and a multiplication each count as a single step. This is not quite true in general because multiplication takes longer to compute than addition. But we can ignore multiplicative constants: even if multiplication takes 5 times longer than addition, the relative order of growth stays the same.

So we count the number of assignments, comparisons, additions, and multiplications needed for each \(n\). Let us construct the table.

\(n\)    \(f(n)\)
1        7
2        10
3        13
4        16

We can generalize this into \(f(n) = 3n + 4\). So we can deduce that the order of growth for factorial is \(O(n)\). Finding the order of growth is as simple as counting.
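
We can reproduce this table by instrumenting the function with a step counter. The sketch below uses one counting scheme consistent with the table: the two initial assignments, each evaluation of the loop condition, each assignment statement in the loop body, and the return each cost one step.

```python
def factorial_with_steps(n):
    # Step-counting sketch; one of several reasonable schemes,
    # chosen to match the table (3n + 4 steps in total).
    steps = 2               # res, val = 1, 1  (two assignments)
    res, val = 1, 1
    while True:
        steps += 1          # the comparison val <= n
        if not (val <= n):
            break
        res = res * val
        steps += 1          # the update of res
        val = val + 1
        steps += 1          # the update of val
    steps += 1              # the return
    return res, steps

for n in range(1, 5):
    print(n, factorial_with_steps(n)[1])
# prints: 1 7, 2 10, 3 13, 4 16 (that is, 3n + 4 steps)
```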

List vs Dictionary

In the comparison between lists and dictionaries, we provided a graph showing the difference in the average time to search for a particular value. The graph is reproduced below.

[Figure Time01: average time to search for a value in a list versus a dictionary as the input size grows]

The graph clearly shows the difference between \(O(n)\) and \(O(1)\). We can see why searching in a list is \(O(n)\) by looking at the code used to find a value, also reproduced below.

  • Ordered


    def find_name(students, matric):
      for student in students:
        if get_matric(student) == matric:
          return get_name(student)
    

    We need to loop through \(n\) students in the worst case. So the order of growth with respect to time is \(O(n)\).

  • Unordered


    def find_name(students, matric):
      idx = 0
      for digit in matric[1:-1]:
        idx = (idx * 10) + int(digit)
      return get_name(students[idx])
    

    Since there are only 7 digits in a matric number, we need only 7 operations regardless of how many students we have. So the order of growth with respect to time is \(O(1)\).

The same thing translates to finding a value based on a key of a dictionary. This is why we may want to use a dictionary: it is faster.
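
To make the contrast concrete, here is a self-contained sketch (the sample data and helper names are ours) of an \(O(n)\) list scan versus an \(O(1)\) dictionary lookup for the same data:

```python
students = [("A0000001", "Alice"), ("A0000002", "Bob"), ("A0000003", "Carol")]

def find_name_list(students, matric):
    # O(n): may scan every pair before finding the key.
    for m, name in students:
        if m == matric:
            return name

by_matric = dict(students)   # one-time O(n) conversion to a dictionary

def find_name_dict(by_matric, matric):
    # O(1): a single hash lookup, independent of the number of students.
    return by_matric.get(matric)

print(find_name_list(students, "A0000003"))   # Carol
print(find_name_dict(by_matric, "A0000003"))  # Carol
```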

Multiset

If we analyze the order of growth with respect to time for the different implementations of multisets, we can see that they are indeed different. We encourage you to do the analysis on your own. We provide the order of growth as comments.

  • List


    # O(1)
    def make_empty_mset():
      return []
    
    # O(n) where n is the number of
    #      elements in mset
    def multiplicity_of(elem, mset):
      res = 0
      for item in mset:
        if item == elem:
          res = res + 1
      return res
    
    # O(n) where n is the larger size between
    #      mset1 and mset2
    def union(mset1, mset2):
      return mset1 + mset2
    
  • Dictionary


    # O(1)
    def make_empty_mset():
      return {}
    
    
    # O(1)
    def multiplicity_of(elem, mset):
      return mset.get(elem, 0)
    
    
    
    
    
    # O(n) where n is the larger size between
    #      mset1 and mset2
    def union(mset1, mset2):
      res = {}
      for key in mset1:  # copy all from mset1
        res[key] = mset1.get(key, 0) + \
                   mset2.get(key, 0)
      for key in mset2:  # copy all from mset2
        res[key] = mset1.get(key, 0) + \
                   mset2.get(key, 0)
      return res
    

Here we can see that if speed is very important, we should use the dictionary representation of a multiset.
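
As a quick check that the dictionary representation behaves like a multiset, we can exercise the functions above (restated here so the sketch is self-contained):

```python
def make_empty_mset():
    return {}

def multiplicity_of(elem, mset):
    return mset.get(elem, 0)

def union(mset1, mset2):
    res = {}
    for key in mset1:  # copy all from mset1
        res[key] = mset1.get(key, 0) + mset2.get(key, 0)
    for key in mset2:  # copy all from mset2
        res[key] = mset1.get(key, 0) + mset2.get(key, 0)
    return res

a = {"x": 2, "y": 1}
b = {"y": 3, "z": 1}
u = union(a, b)
print(multiplicity_of("y", u))                  # 4: multiplicities add up
print(multiplicity_of("w", make_empty_mset()))  # 0: absent elements
```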