Skip to content

List

Learning Objectives

At the end of this sub-unit, students should

  • understand list.
  • understand the concept of mutability.
  • know how to work with list using basic operations.

Immutability

Python data types that we have encountered so far are what we call as immutable data type. What it means is that once the data is created, the content cannot be modified.

A few things need to be made clear. Consider the tuple tpl = (1, 2). By the data, we meant the tuple (1, 2). By modification, it means that we cannot simply look at the box containing the value 1 and change the content of the box into 9. What we can do is create a new tuple (9, 2) and assign this to the variable tpl as shown below.

1
2
3
4
tpl = (1, 2)
print(tpl)
tpl = (9, 2)
print(tpl)
(1, 2)
(9, 2)

We can explain this behavior much better with box-and-arrow diagram.

Tuple01

Tuple02

Tuple03

Notice how the old tuple is still there. A new tuple is created. The variable tpl is then set to point to the new tuple. Since there is no more arrows pointing to the old tuple, the memory is reclaimed by Python so that it can be used for other objects. This is similar to how a ballroom is cleared after an event is completed so that it can be used for another event.

So what do we really mean by modifying the content? To modify only the content, we want to be able to do the following operation below. Unfortunately, we will get an error for tuple.

tpl = (1, 2)
tpl[0] = 9 # only modify the first element
TypeError: 'tuple' object does not support item assignment

Similarly for string, we will also get the same problem.

s = "CS1010E"
s[-1] = 'S'
TypeError: 'str' object does not support item assignment

Given the different usage of "modify", we will use two different terminology to avoid confusion. We will look at the general assignment statement below.

lhs = rhs

In the case that lhs is a variable, we say that it is an assignment. On the other hand, if lhs is not a variable but an expression that evaluates to something like collection[value], we say that we are updating the collection at a particular value. For instance, in the case of tpl[0] = 9, we will say the following.

We are trying to update the tuple tpl at index 0 to the value 9.

Of course, in this case, we cannot do that because tuple is immutable. What we will learn next is the first mutable data type called list.

List

A list is a mutable sequence of anything. Unlike tuple, a list is enclosed by a square bracket (i.e., []).

List

The syntax for list is as follows.

[ expression ,  expression ,  expression , ...,  expression ]

Unlike tuple, we do not need to add trailing comma for singleton list. This is because there are fewer usage of square brackets (e.g., indexing and slicing) so there is no ambiguity in the use for a computer.

  • Empty Tuple: []
  • Singleton: [ expr ]      (the comma at the end is not required, but possible)
  • General: [ expr1, expr2, expr3 ]      (can have as many expressions as needed)

The simple cases is similar to tuple.

List

1
2
3
4
5
6
>>> []   # empty list
[]
>>> [1]  # singleton
[1]
>>> [1, 2, 3]
[1, 2, 3]

To make things short, we simply state that the sequence operations will still work for list. The list of operations that have similar behavior as tuple is listed below. We will assume that the variable lst holds a list.

List Operations

Operation Description Example
Indexing Returns the element at the given index lst[1]
Slicing Returns a new list from the original list lst[0:3:2]
Iteration Iterates the list in a for-loop for elem in lst:
Append Concatenates two list and return the new list lst + [2, 3]
Repetition Repeats a list for \(n\) times and return a new list lst * n

No Mix Type

Although both list and tuple are sequence of anything, they are different types. One important rule to note is that we cannot concatenate a list with a tuple or vice versa. Both of the following will cause errors.

1
2
3
4
>>> [1, 2] + (3, 4)
TypeError: can only concatenate list (not "tuple") to list
>>> (1, 2) + [3, 4]
TypeError: can only concatenate tuple (not "list") to tuple

Update Operation

What sets a list apart from a tuple is the idea of mutability. Recap that we cannot update an element in the tuple, we can only create a new tuple. A list allows us to update the value inside the box without creating new list.

List Mutability

1
2
3
4
5
6
>>> lst = [1, 2, 3]
>>> lst
[1, 2, 3]
>>> lst[0] = 9
>>> lst
[9, 2, 3]

The box-and-arrow diagram explanation is shown below. Note that the box for each element in the list is an open box because we can set a new value into the box. In our convention, an open box indicates a mutable object while a closed box indicates an immutable object. To put it simply, an open box can be used as lhs in the assignment statement lhs = rhs.

List01

List02

Notice how there is no new list created. Since the box is open, we can simply put in new value inside this open box.

How Do We Know?

But how do we know that there really is no new list created? After all, we cannot see the box-and-arrow diagram when running the program. Actually, we can. If we can stop make the execution to run step by step and if we can see the data inside the memory in between, we will see that the layout in the memory will look like that when the addresses are being connected.

Of course we will not do that. There is another way. It involves the use of a built-in function called id. This function can be used to determine the identity of any object. Typically, the identity of an object is the address where the object is located in memory. So if two object have the same identity, there is no new box created.

Let us look at what happen for tuple. The output of the id function will be different for each run because the initial address is randomized.

>>> tpl = (1, 2, 3)
>>> tpl
(1, 2, 3)
>>> id(tpl)
4347996736
>>> tpl = (9, 2, 3)
>>> tpl
(9, 2, 3)
>>> id(tpl)  # different from before
4347989760

The next is what happen for list.

>>> lst = [1, 2, 3]
>>> lst
[1, 2, 3]
>>> id(lst)
4348199616
>>> lst[0] = 9
>>> lst
[9, 2, 3]
>>> id(lst)  # same as before
4348199616
>>> lst = [9, 2, 3]
>>> lst
[9, 2, 3]
>>> id(lst)  # now it is different
4347998656

Aliasing Problem

We first encounter the concept of aliasing in tuple. But because tuple is immutable, there is no discernible effect of aliasing. Unfortunately, when aliasing is combined with mutability, a lot of confusion may ensue. To see the problem, we will start with a simple aliasing of variables.

1
2
3
4
5
6
7
8
9
>>> lst1 = [1, 2, 3]
>>> lst1
[1, 2, 3]
>>> lst2 = lst1   # lst1 and lst2 are aliased
>>> lst2[0] = 9   # we update lst2, not touching lst1
>>> lst1          #   and see what lst1 looks like
[9, 2, 3]
>>> lst2
[9, 2, 3]

What is happening? How come the value of lst1 is changed even if we did not update it? We are only updating lst2, but because lst1 and lst2 are aliased, the update made to lst2 will affect lst1. This is better explained with box-and-arrow diagram. Notice how the both lst1 and lst2 are pointing to the same list.

List03B

List03B

List03C

At the last tab, we are updating lst2[0]. This is indicated by the red path. However, this same box can also be reached from the arrow pointing out from lst1. Hence, even if the change is made to lst2[0], we can see the effect from lst1 because of the arrow pointing to the same location. That is the problem caused by aliasing.

So what can we do to avoid aliasing? One possible attempt is to copy the list.

1
2
3
4
5
def copy(lst):
  res = []
  for elem in lst:
    res = res + [elem]
  return res

Let us see the effect on the previous problem.

1
2
3
4
5
6
7
8
9
>>> lst1 = [1, 2, 3]
>>> lst1
[1, 2, 3]
>>> lst2 = copy(lst1)   # no longer aliased
>>> lst2[0] = 9         # we update lst2, not touching lst1
>>> lst1                #   and see what lst1 looks like
[1, 2, 3]
>>> lst2
[9, 2, 3]

Sadly, this will not help us for that long. Life is not always that simple.

>>> lst = [1, 2, 3]
>>> lst1 = [lst]
>>> lst1
[[1, 2, 3]]
>>> lst2 = copy(lst1)
>>> lst2[0][0] = 9
>>> lst1
>>> lst1
[[9, 2, 3]]
>>> lst2
[[9, 2, 3]]

While the variable lst1 and lst2 themselves are not aliased, the expression lst1[0] and lst2[0] are aliased as seen from the box-and-arrow diagram below.

List04A

List04B

List04C

List04D

So instead of fighting against aliasing, we need to embrace aliasing. After all, if we do not want aliasing to be a problem, we always have tuple. Why do we need to learn list? That is what we are going to discuss next.