Multiple Representation
Learning Objectives
At the end of this sub-unit, students should
- appreciate having multiple representation for a single ADT.
- know how to represent an ADT in multiple ways.
Multiset
We know that Python has an implementation of a set. In a set, there are no duplicate values. What if we want to keep duplicate values? Mathematically, this is represented as a multiset. In a multiset, the count of each element --called multiplicity-- matters but the order does not matter.
For example, \(\{a, b, b, a, b\}\) is equal to \(\{b, a, b, a, b\}\) but not equal to \(\{a, b\}\). We can check this by looking at the multiplicity. In both \(\{a, b, b, a, b\}\) and \(\{b, a, b, a, b\}\), the multiplicity of \(a\) is 2 and the multiplicity of \(b\) is 3. So we have the same multiplicity for each element.
On the other hand, the multiplicity of \(a\) is 1 and the multiplicity of \(b\) is 1 for \(\{a, b\}\). Since there is a different multiplicity, this is not equal to \(\{b, a, b, a, b\}\) if we treat this as a multiset. Our task is to implement the following operations for multiset.
make_empty_mset()
: Returns an empty multiset.multiplicity_of(elem, mset)
: Returns the multiplicity of the elementelem
in the multisetmset
. The multiplicity may be 0.union(mset1, mset2)
: Returns the union of the two multisets.
Note that for union, we need to define it carefully. Since the multiplicity is important, the multiplicity of each element \(a\) in the result should be the sum of the multiplicity of \(a\) in both operands.
With that, we can now present some possible representation of a multiset.
List
List is an ordered sequence so the order matter. But if we are using a list to represent a multiset, the order should not matter. We can of course sort the list so that some operations are easier, but we will choose the simplest representation instead.
Now that we know that we want to represent a multiset as a list, we can decide on what the empty multiset is. This should correspond to an empty list because in both cases, there should not be any element.
Now, consider the \(\{a, b, b, a, b\}\) from above. How will this look like as a list? A possible representation is the following.
With this concrete example, we can rephrase the problem of finding the multiplicity of a given element is the problem of counting the number of occurrences of the element in the list. We will solve this with a simple accumulator pattern.
Finally, to perform a union of two multiset, we can simply concatenate the two list!
Putting things together, we get the following representation of multiset as list.
Multiset as List
Dictionary
Can we solve this more easily with a dictionary? That depends on what we meant by easy. If we want to represent this as a dictionary, we immediately know that an empty multiset is really just an empty dictionary.
What about multiplicity? Well, again, we have to try thinking about what \(\{a, b, b, a, b\}\) looks like. It is not quite clear how we can use a dictionary for this. One way is to simply mimic a list. We will use an integer as the key.
This does not look nice. Is there another way? Since the count of each element is important, we can use the element as the key and the count as the value. So \(\{a, b, b, a, b\}\) will be representated as the following.
Now we can see that finding the multiplicity is trivial.
But remember, we need to ensure that there is no error if the multiplicity is 0.
We can use mset.get(key, 0)
to ensure this.
Finally, in the case of union, we need to create a new empty multiset. We then need to copy the key from both multiset. At the same time as we are copying, we need to sum the count from both. In the solution below, we do not care if we assign the value multiple times as long as we are copying from both. This is why we do not need check if the key already exist in the result.
Putting things together, we get the following representation of multiset as dictionary. The main thing to note is that some operations become simpler as a list and some operations become simpler as a dictionary.
Multiset as Dictionary
Matrix
Let us look at some other ADT with a possible multiple representation.
Dense Matrix
A dense matrix representation is implemented using 2-dimensional array. We have such a representation before. The reason why this is called a dense matrix is because we store everything including all the zeros.
What happen if everything is zero? We still need to store everything. Can we do better? Yes, the idea is to not store zeros. This is called a sparse matrix which we will discuss next.
Sparse Matrix
To avoid storing the zero, we need to know the following information.
- The number of rows.
- Previously, we can know this by
len(m)
.
- Previously, we can know this by
- The number of columns.
- Previously, we can know this by
len(m[0])
.
- Previously, we can know this by
- The element at a particular row and column.
- Previously, we can know this by
m[row][col]
assuming zero-based value.
- Previously, we can know this by
In this representation, if the element at a particular row and column is zero, we do not even need to have that in our representation. Consider the following matrix.
We can compare the two representation as follows.
-
Dense Matrix
-
Sparse Matrix
We can see that for this particular case, the sparse matrix seems to be more complicated. But imagine we have 100 rows and 100 columns where we only have 3 non-zero elements. The representation for sparse matrix becomes simpler as the data will only contain 3 values.
We need to talk a little bit more about the data.
Note that the data is represented as a dictionary.
This is quite a special dictionary.
The key in this dictionary is the pair (row, col)
and the value is the value of the matrix at the given row
and col
.
Let us give a different example to clearly show the representation. Consider the following matrix.
-
Dense Matrix
-
Sparse Matrix
We can now write the functions for matrix that we have written before. To illustrate the benefit of this representation, we will show the implementation for zero matrix, identity matrix, and transpose. Try to solve these on your own before looking at the solution below.
Sparse Matrix