Matrix

Learning Objectives

At the end of this sub-unit, students should

understand a simple mathematical matrix.
know how to work with mathematical matrix.

Working with Matrix

In mathematics, a matrix is a 2-dimensional arrangement of numbers. It is central to mathematics in the fields of linear algebra. We will look only at a simple operations over square matrices.

A square matrix of size \(n\) is an \(n \times n\) arrangement of numbers. We can represent this as a 2-dimensional array of numbers. For simplicity, we will work only with integer. A general \(3 \times 3\) matrix can be represented as follows.

Mathematics

\[ \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} \]

Code

matrix = [
  [a, b, c],
  [d, e, f],
  [g, h, i]
]

With this definition of a matrix, we can implement several useful operations as simple examples of how to work with 2-dimensional array before we are moving towards a more general table.

Construction

As the first two steps, we want to be able to construct matrices. There are two simple matrices that we can create. The first is a matrix of zeros and the second is called the identity matrix.

Zeros

Our task here is to implement the function zeros(n) that returns a matrix of size \(n \times n\) such that each element is the number 0. Even this simple task may have a surprising pitfalls. Let us be extremely lazy and simply use the repetition operator *. After all, Python provides us with that.

Zeros

def zeros(n):
  return ([[0] * n]) * n

>>> zeros(3)
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]

That looks correct, what is wrong with the code above? To understand the problem, let us try to split the one-line solution into multiple lines by using variables.

def zeros(n):
  row = [0] * n
  res = [row] * n
  return res

>>> m = zeros(3)
>>> m[0][0] = 9
>>> m
[[9, 0, 0], [9, 0, 0], [9, 0, 0]]

Hopefully it is now clear that all the rows are exactly the same row. In other words, all the rows are alias of one another. This means that if we update one of the row, all the rows are updated. This problem is shown in the sample run above. The box-and-arrow diagram is shown below to visualize this problem.

Alias

So we should not be lazy and we should construct a new row in each iteration.

Zeros

def zeros(n):
  res = []
  for i in range(n):
    row = [0] * n
    res.append(row)
  return res

>>> m = zeros(3)
>>> m[0][0] = 9
>>> m
[[9, 0, 0], [0, 0, 0], [0, 0, 0]]

Identity

The identity matrix of size \(n\) is an \(n \times n\) matrix such that the main diagonal¹ is 1 and all other values are 0. Given the function to construct an \(n \times n\) matrix of zeros, we can use this and update the main diagonal accordingly. This is also an example of problem decomposition.

Identity

def identity(n):
  res = zeros(n)
  for i in range(n):
    res[i][i] = 1
  return res

Matrix Operations

Matrix Addition

Consider two general \(n \times n\) matrices called \(A\) and \(B\).

\[ A = \begin{vmatrix} a_{0,0} & a_{0,1} & \cdots & a_{0,n-1} \\ a_{1,0} & a_{1,1} & \cdots & a_{1,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1,0} & a_{n-1,1} & \cdots & a_{n-1,n-1} \end{vmatrix} \ \ \ B = \begin{vmatrix} b_{0,0} & b_{0,1} & \cdots & b_{0,n-1} \\ b_{1,0} & b_{1,1} & \cdots & b_{1,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ b_{n-1,0} & b_{n-1,1} & \cdots & b_{n-1,n-1} \end{vmatrix} \]

The operation \(A + B = C\) is defined as follows.

\[ A + B = C = \begin{vmatrix} a_{0,0} + b_{0,0} & a_{0,1} + b_{0,1} & \cdots & a_{0,n-1} + b_{0,n-1} \\ a_{1,0} + b_{1,0} & a_{1,1} + b_{1,1} & \cdots & a_{1,n-1} + b_{1,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1,0} + b_{n-1,0} & a_{n-1,1} + b_{n-1,1} & \cdots & a_{n-1,n-1} + b_{n-1,n-1} \end{vmatrix} \]

That is quite a handful. So let us unpack it a little bit. If we look at the resulting element in \(C\) at some index (i.e., \(c_{i,j}\)), we can see that this is simply the sum of the corresponding elements from \(A\) and \(B\) (i.e., \(a_{i,j} + b_{i,j}\)). Writing this as code, we get the following code.

c[i][j] = a[i][j] + b[i][j]

This means, we need to enumerate all the possible indices. There are two variables i and j, so we need two loops.

But how do we get the number of rows and the number of columns? The number of rows in a matrix m is simply the number of elements in m. This can be obtained by len(m). But what is the number of columns? Assuming that there is at least one row, we can get one of the row and ask the number of elements in the row. Because our matrix is a 2-dimensional array and because array has a regular structure, this will always give us the correct value. So to get the number of columns, the code is len(m[0]). As we are only dealing with square matrices, we just need one of those values.

Must We Use the First Row?

It does not have to be the first element. We can also use the last element: len(m[-1]). Any other random element is not guaranteed to exist. We can also get the middle, but that requires computation and we want our code to be as simple as possible for readability.

Matrix Addition

# Assume m1 and m2 have the same size
def add_matrix(m1, m2):
  res = zeros()
  n = len(m1)
  for i in range(n):
    for j in range(n):
      res[i][j] = m1[i][j] + m2[i][j]
  return res

Matrix Multiplication

The definition of matrix multiplication is more complicated. Without giving the reason why it is defined this way, we will simply state the definition. After all, this is not a linear algebra course. We will use the same matrix \(A\) and \(B\) above.

\[ A \cdot B = C = \begin{vmatrix} a_{0,0}b_{0,0} + \cdots + a_{0,n-1}b_{n-1,0} & a_{0,0}b_{0,1} + \cdots + a_{0,n-1}b_{n-1,1} & \cdots & a_{0,0}b_{0,n-1} + \cdots + a_{0,n-1}b_{n-1,n-1} \\ a_{1,0}b_{0,0} + \cdots + a_{1,n-1}b_{n-1,0} & a_{1,0}b_{0,1} + \cdots + a_{1,n-1}b_{n-1,1} & \cdots & a_{1,0}b_{0,n-1} + \cdots + a_{1,n-1}b_{n-1,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1,0}b_{0,0} + \cdots + a_{n-1,n-1}b_{n-1,0} & a_{n-1,0}b_{0,1} + \cdots + a_{n-1,n-1}b_{n-1,1} & \cdots & a_{n-1,0}b_{0,n-1} + \cdots + a_{n-1,n-1}b_{n-1,n-1} \end{vmatrix} \]

If we look at each element of \(C\), we will see the following formula.

\[ \begin{align} c_{i,j} & = a_{i,0}b_{0,j} + \cdots + a_{i,n-1}b_{n-1,j} \\ & = \sum_{k=0}^{n-1} a_{i,k}b_{k,j} \end{align} \]

Visually, we do a pairwise multiplication of the \(k\)^th row from \(A\) with the \(k\)^th column from \(B\) and sum the result.

Multiply

By using the formula, we can see that for each i and j, we need a loop to compute c[i][j] as follows.

c[i][j] = 0
for k in range(...):
  c[i][j] = c[i][j] + (a[i][k] * b[k][j])

Putting this into the function, we get the following triply nested loop.

Matrix Multiplication

# Assume m1 and m2 have the same size
def mul_matrix(m1, m2):
  res = zeros()
  n = len(m1)
  for i in range(n):
    for j in range(n):
      # c[i][j] is already 0
      for k in range(n):
        c[i][j] = c[i][j] + (a[i][k] * b[k][j])
  return res

Is There a Better Multiplication?

This is beyond the scope of our notes because again, we are not teaching linear algebra. But there is better algorithm called the Strassen Algorithm. Try to implement these on your own to test your programming skill.

Transpose

A matrix transpose is the operation of flipping the matrix over the main diagonal. This is similar to what we did with the trace table to get the signal processing view. It is captured by the following assignment from a matrix m to the transposed version t.

t[i][j] = m[j][i]

The code is quite straightforward so we will make it a little more complicated to show another kind of operations we may want to do with a matrix. This operation is called in-place operation. An in-place operation is an operation that does not require more than a constant number of additional space. Note that a single variable is not one space because if the variable is holding a matrix, then it has \(n \times n\) space.

If we look at the assignment above, we will realize that we need an additional matrix t. So we need to remove this. To do that, we need to change the assignment to a swapping. This can be done in Python rather easily via the following assignment.

m[i][j], m[j][i] = m[j][i],  m[i][j]

With a little coding, we may arrive at the following code that is unfortunately incorrect.

Transpose

def transpose(m):
  n = len(m)
  for i in range(n):
    for j in range(n):
      m[i][j], m[j][i] = m[j][i],  m[i][j]

Why is this wrong? If we look at some values say i = 4 and j = 5, we will notice that we swap m[4][5] with m[5][4]. But at some other iterations in the future, we will have i = 5 and j = 4 and now we swap m[5][4] with m[4][5]. This is the same swap that we have done before! So we get back the same matrix as before.

The correction is to work only with one half of the matrix. The half that we choose --for simplicity-- is the upper triangular half.

Upper

Here we see that the starting value of j should be i. An alternative is to start from i + 1 because if i == j, it is exactly the main diagonal. There is no change to the values on the main diagonal because it is basically the following swap.

m[i][i], m[i][i] = m[i][i],  m[i][i]

Transpose

def transpose(m):
  n = len(m)
  for i in range(n):
    for j in range(i, n):
      m[i][j], m[j][i] = m[j][i],  m[i][j]

The main diagonal is the diagonal from top left to bottom right. ↩