List
Learning Objectives
At the end of this sub-unit, students should
- understand list.
- understand the concept of mutability.
- know how to work with list using basic operations.
Immutability
Python data types that we have encountered so far are what we call as immutable data type. What it means is that once the data is created, the content cannot be modified.
A few things need to be made clear.
Consider the tuple tpl = (1, 2)
.
By the data, we meant the tuple (1, 2)
.
By modification, it means that we cannot simply look at the box containing the value 1
and change the content of the box into 9
.
What we can do is create a new tuple (9, 2)
and assign this to the variable tpl
as shown below.
We can explain this behavior much better with box-and-arrow diagram.
Notice how the old tuple is still there.
A new tuple is created.
The variable tpl
is then set to point to the new tuple.
Since there is no more arrows pointing to the old tuple, the memory is reclaimed by Python so that it can be used for other objects.
This is similar to how a ballroom is cleared after an event is completed so that it can be used for another event.
So what do we really mean by modifying the content? To modify only the content, we want to be able to do the following operation below. Unfortunately, we will get an error for tuple.
Similarly for string, we will also get the same problem.
Given the different usage of "modify", we will use two different terminology to avoid confusion. We will look at the general assignment statement below.
In the case that lhs
is a variable, we say that it is an assignment.
On the other hand, if lhs
is not a variable but an expression that evaluates to something like collection[value]
, we say that we are updating the collection
at a particular value
.
For instance, in the case of tpl[0] = 9
, we will say the following.
We are trying to update the tuple
tpl
at index0
to the value9
.
Of course, in this case, we cannot do that because tuple
is immutable.
What we will learn next is the first mutable data type called list
.
List
A list is a mutable sequence of anything.
Unlike tuple, a list is enclosed by a square bracket (i.e., []
).
List
The syntax for list is as follows.
Unlike tuple, we do not need to add trailing comma for singleton list. This is because there are fewer usage of square brackets (e.g., indexing and slicing) so there is no ambiguity in the use for a computer.
- Empty Tuple:
[]
- Singleton:
[ expr ]
(the comma at the end is not required, but possible) - General:
[ expr1, expr2, expr3 ]
(can have as many expressions as needed)
The simple cases is similar to tuple.
To make things short, we simply state that the sequence operations will still work for list.
The list of operations that have similar behavior as tuple is listed below.
We will assume that the variable lst
holds a list.
List Operations
Operation | Description | Example |
---|---|---|
Indexing | Returns the element at the given index | lst[1] |
Slicing | Returns a new list from the original list | lst[0:3:2] |
Iteration | Iterates the list in a for-loop | for elem in lst: |
Append | Concatenates two list and return the new list | lst + [2, 3] |
Repetition | Repeats a list for \(n\) times and return a new list | lst * n |
No Mix Type
Although both list and tuple are sequence of anything, they are different types. One important rule to note is that we cannot concatenate a list with a tuple or vice versa. Both of the following will cause errors.
Update Operation
What sets a list apart from a tuple is the idea of mutability. Recap that we cannot update an element in the tuple, we can only create a new tuple. A list allows us to update the value inside the box without creating new list.
The box-and-arrow diagram explanation is shown below.
Note that the box for each element in the list is an open box because we can set a new value into the box.
In our convention, an open box indicates a mutable object while a closed box indicates an immutable object.
To put it simply, an open box can be used as lhs
in the assignment statement lhs = rhs
.
Notice how there is no new list created. Since the box is open, we can simply put in new value inside this open box.
How Do We Know?
But how do we know that there really is no new list created? After all, we cannot see the box-and-arrow diagram when running the program. Actually, we can. If we can stop make the execution to run step by step and if we can see the data inside the memory in between, we will see that the layout in the memory will look like that when the addresses are being connected.
Of course we will not do that.
There is another way.
It involves the use of a built-in function called id
.
This function can be used to determine the identity of any object.
Typically, the identity of an object is the address where the object is located in memory.
So if two object have the same identity, there is no new box created.
Let us look at what happen for tuple.
The output of the id
function will be different for each run because the initial address is randomized.
The next is what happen for list.
Aliasing Problem
We first encounter the concept of aliasing in tuple. But because tuple is immutable, there is no discernible effect of aliasing. Unfortunately, when aliasing is combined with mutability, a lot of confusion may ensue. To see the problem, we will start with a simple aliasing of variables.
What is happening?
How come the value of lst1
is changed even if we did not update it?
We are only updating lst2
, but because lst1
and lst2
are aliased, the update made to lst2
will affect lst1
.
This is better explained with box-and-arrow diagram.
Notice how the both lst1
and lst2
are pointing to the same list.
At the last tab, we are updating lst2[0]
.
This is indicated by the red path.
However, this same box can also be reached from the arrow pointing out from lst1
.
Hence, even if the change is made to lst2[0]
, we can see the effect from lst1
because of the arrow pointing to the same location.
That is the problem caused by aliasing.
So what can we do to avoid aliasing? One possible attempt is to copy the list.
Let us see the effect on the previous problem.
Sadly, this will not help us for that long. Life is not always that simple.
While the variable lst1
and lst2
themselves are not aliased, the expression lst1[0]
and lst2[0]
are aliased as seen from the box-and-arrow diagram below.
So instead of fighting against aliasing, we need to embrace aliasing. After all, if we do not want aliasing to be a problem, we always have tuple. Why do we need to learn list? That is what we are going to discuss next.