Tuple
Learning Objectives
At the end of this sub-unit, students should
- understand tuples.
- know how to work with tuples.
- know how to use tuples in a for-loop.
Sequence of Anything
A string is a sequence of characters, where each character is itself a string. A range¹ is a sequence of integers only. Is there a sequence of anything?
A sequence of anything is called a tuple, with the data type in Python written as `tuple`.
It is a data type with built-in support from Python.
To create a tuple, we can enclose values in parentheses.
If there are multiple values, we separate them with commas.
Tuple
The syntax for tuple is as follows.
To avoid an overly complicated syntax², we will simply explain the few cases of interest.
- Empty Tuple: `()`
- Singleton: `( expr, )` (a singleton is a tuple with a single value; the comma at the end is required)
- General #1: `( expr1, expr2, expr3 )` (can have as many expressions as needed)
- General #2: `( expr1, expr2, expr3, )` (can have a trailing comma)
Why Different Singleton?
The problem with parentheses is that they are used in many different places. We will list some of those places. Understanding the context in which parentheses occur is a prerequisite to understanding tuples.
- Parentheses to disambiguate order of operation: `(expr1 + expr2) * expr3`.
  - There must be no comma inside the outer parentheses.
  - There may be commas inside `expr1` or `expr2`.
- Parentheses to invoke functions: `function(arg1, arg2)`.
  - The name `function` must be a function.
- Parentheses for tuple: `(expr1, expr2)`.
  - There must not be a name before the parentheses.
  - Note that `(expr)` cannot create a tuple because that is already the same as the first case above.
    - Hence, we use `(expr,)` to differentiate the two cases.
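The three contexts above can be checked with a short sketch. The variable names are illustrative only and do not come from the original notes.

```python
# Case 1: parentheses only group an expression; no tuple is created.
grouped = (1 + 2) * 3
not_a_tuple = (5)

# Case 2: parentheses invoke a function (here, the built-in max).
largest = max(1, 2)

# Case 3: parentheses with a comma create a tuple.
pair = (1, 2)
singleton = (5,)

print(type(not_a_tuple))  # <class 'int'>
print(type(singleton))    # <class 'tuple'>
```

The only difference between `(5)` and `(5,)` is the comma, which is why the singleton needs it.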
Given that, let us give some concrete examples of tuples.
Tuple
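A minimal sketch of the four syntax cases follows. The values and variable names are chosen only for illustration.

```python
empty = ()             # empty tuple
single = (42,)         # singleton: the trailing comma is required
general = (1, 2, 3)    # general case
trailing = (1, 2, 3,)  # a trailing comma is allowed and changes nothing

print(len(empty), len(single))  # 0 1
print(general == trailing)      # True
```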
Bad Practice
Although we can have a trailing comma, we cannot have a blank expression. A blank expression usually shows up as two consecutive commas, or as a leading comma when the blank is the first element. So the following examples cause errors.
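Because a syntax error stops a program before it runs, the sketch below keeps each bad expression in a string and asks Python to compile it. The strings are illustrative guesses at typical mistakes, not the examples from the original notes.

```python
# Each string is tuple-like source text containing a blank expression.
bad_sources = ["(1, , 2)", "(1, 2, , )", "(, 1)"]

results = []
for source in bad_sources:
    try:
        compile(source, "<example>", "eval")
        results.append("ok")
    except SyntaxError:
        results.append("SyntaxError")

print(results)  # ['SyntaxError', 'SyntaxError', 'SyntaxError']
```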
Anything?
We say that a tuple is a sequence of anything, but what is anything? We can actually put any values inside a tuple. These values do not have to be of one type either. So we can have a tuple containing both a string and an integer as shown below.
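For instance, a tuple mixing strings and an integer might look like this (a sketch; the variable name `mixed` is illustrative):

```python
mixed = ('CS', 1010, 'S')  # two strings and one integer in the same tuple

first_type = type(mixed[0])
second_type = type(mixed[1])
print(first_type)   # <class 'str'>
print(second_type)  # <class 'int'>
```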
But that is not yet anything. What is the latest kind of value we have learnt? That's right, a tuple, so we can also create a tuple within a tuple. This is a glimpse of the nested structures we will explore in the future. A possible use is a matrix, which is a 2-dimensional structure.
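As a small sketch of nesting, the tuple `nested` below contains two tuples, and `matrix` is a hypothetical 2-by-3 matrix stored as a tuple of rows.

```python
nested = (('CS', 1010, 'S'), ('is', 'easy'))

# A hypothetical 2x3 matrix: each inner tuple is one row.
matrix = ((1, 2, 3),
          (4, 5, 6))

print(len(nested))   # 2
print(matrix[1][2])  # 6 (row 1, column 2)
```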
Box-and-Arrow Diagram
Before we look at the operations that can be performed on a tuple, it will be easier to explain them if we can visually represent a tuple.
The representation we choose is as follows.
Consider the tuple `tpl = (1, 2, 3)` first.
Let us break down the meaning of the box-and-arrow diagram a little here.
On the far left, we have the variable `tpl`.
This is represented by an open box because we can modify the value of `tpl` via an assignment (e.g., `tpl = (3, 2, 1)`).
So the value can be changed.
It then has an arrow to a sequence of three elements.
Whenever the value inside the box is an arrow, we need to indicate where it starts.
So we put the symbol ✱ as the origin of this arrow.
The arrow and the symbol ✱ have another meaning.
The actual value inside the box is the address of the tuple.
But remember that we are abstracting the address away.
The address is managed by the computer, so we do not care where it is located in memory.
We are only interested in the fact that the value should be whatever the address of that tuple is.
So we also abstract this idea away and simply connect the two locations with an arrow.
In short, when we put ✱ inside the box, that is the actual value in the box.
This will have an impact on function calls.
Now, the positive index is shown at the top for clarity, but it is not necessary.
The element at index 0 is 1, the element at index 1 is 2, and the element at index 2 is 3.
This corresponds to the expression `(1, 2, 3)`.
Notice how the boxes of the elements are closed. This is because we cannot re-assign an element once the tuple is created. We choose the term re-assign carefully, as modify is ambiguous. There are ways of modifying a value without re-assigning it. But for now, we have no way of doing that.
Hopefully that is simple enough if we ignore the open/close box convention which will only be useful later.
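The open and closed boxes can be observed directly. In this sketch, re-assigning the variable succeeds, while re-assigning an element raises `TypeError`; the name `outcome` is only there to record what happened.

```python
tpl = (1, 2, 3)
tpl = (3, 2, 1)  # open box: the variable can be re-assigned

try:
    tpl[0] = 99  # closed box: an element cannot be re-assigned
    outcome = "re-assigned"
except TypeError:
    outcome = "TypeError"

print(tpl)      # (3, 2, 1)
print(outcome)  # TypeError
```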
What about a nested tuple?
Consider the tuple `tpl = (('CS', 1010, 'S'), ('is', 'easy'))`.
The box-and-arrow diagram is shown below.
Note the difference.
The first element of `tpl` is the entire expression `('CS', 1010, 'S')` (shown at the bottom to avoid crossing arrows).
So the first element itself is a tuple!
Similarly, the second element is also a tuple, but a different one, from the expression `('is', 'easy')`.
Box-and-Arrow for String
You may notice something odd here: why don't we represent a string such as `'CS'` with a sequence of boxes too?
The reason is simple: the odd expression `'X'[0][0][0][0]` evaluates to `'X'`, which would require a loop from the element back to itself.
Yet at the same time, the value inside the box would have to be both ✱, to indicate that it has an arrow, and the character `'X'`.
This becomes more and more like string theory, with the superposition of values and the string forming a loop.
So we choose to simplify the representation of a string by writing the entire value inside the box.
The indexes are left as an exercise to the reader.
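The loop described above is easy to observe: indexing a one-character string gives back a one-character string, so the indexing can be repeated indefinitely.

```python
s = 'X'

same_after_one = s[0] == s
same_after_four = s[0][0][0][0] == 'X'

print(same_after_one)   # True
print(same_after_four)  # True
print(type(s[0]))       # <class 'str'>
```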
Sequence Operations
With the help of the box-and-arrow diagram, we can figure out most of the operations.
Indexing
The first operation we are going to discuss is indexing.
We will use the example of the nested tuple (i.e., `tpl = (('CS', 1010, 'S'), ('is', 'easy'))`).
Say we want to get the first element.
How can we do it?
As before, we simply write `tpl[0]`.
What is the result?
Well, the first element is a tuple, so we get the entire tuple `('CS', 1010, 'S')`.
Okay, that is easy.
But now note that since `tpl[0]` is a tuple, we can also ask for the second element of this tuple.
If the resulting tuple is in a variable `y`, then we can get the second element as `y[1]`.
However, because `y` is in fact `tpl[0]`, we should be able to write it as `tpl[0][1]`.
And indeed, that is the case.
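A short sketch confirms that going through the intermediate variable `y` and chaining the indexes give the same result.

```python
tpl = (('CS', 1010, 'S'), ('is', 'easy'))

y = tpl[0]
via_variable = y[1]
chained = tpl[0][1]

print(y)             # ('CS', 1010, 'S')
print(via_variable)  # 1010
print(chained)       # 1010
```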
This will get more complicated with more nesting.
Luckily, we have the box-and-arrow diagram.
At the very least, we will assume that we can draw the box-and-arrow diagram correctly.
If not, we need to practice drawing it correctly.
But let us assume we can.
Then we can describe the operation `seq[idx]` as follows, assuming that there is no error.
Indexing
- Evaluate `seq` by finding the variable called `seq`.
  - This variable should have the value of ✱.
- Follow the arrow from the value ✱ inside the box.
  - This should lead to a sequence where `idx` is a valid index.
- Retrieve the value at `idx`.
  - If `idx` is negative, use the negative index.
Then we can explain `tpl[0][1]` as first evaluating the leftmost operation `tpl[0]`, and then applying the remaining `[1]` to the result.
Each one will be evaluated based on the indexing procedure above.
We will highlight the full steps below in tabs.
Follow the tabs to see the progression.
With this procedure, we can evaluate any indexing as long as we can draw the box-and-arrow diagram.
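The left-to-right evaluation can be sketched as a loop that applies one index at a time. The helper `follow` is hypothetical and exists only to illustrate the order of evaluation; it is not a built-in.

```python
def follow(seq, *indexes):
    """Apply each index in turn, from left to right (hypothetical helper)."""
    current = seq
    for idx in indexes:
        current = current[idx]  # one application of the indexing procedure
    return current

tpl = (('CS', 1010, 'S'), ('is', 'easy'))

step1 = tpl[0]    # follow the arrow from tpl, retrieve index 0
step2 = step1[1]  # follow the arrow from that element, retrieve index 1

print(step2)                # 1010
print(follow(tpl, 0, 1))    # 1010
print(follow(tpl, -1, -1))  # easy
```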
Length
There is another advantage of the box-and-arrow diagram.
Using the diagram, the answer to `len(tpl)` is quite obvious.
In this case, it really is just the number of elements in the tuple represented by `tpl`, which is 2.
We do not look further inside to count the endpoints.
This corresponds nicely to the idea from before that for a sequence of `n` elements, the valid indexes are as shown below.
- Positive: `0` to `n - 1`
- Negative: `-1` to `-n`
Connecting back to the nested tuple above, `tpl[2]` (and higher) as well as `tpl[-3]` (and lower) will give us `IndexError`.
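The length and the valid index ranges can be checked on the nested tuple. In this sketch, the list `errors` simply records which indexes raised `IndexError`.

```python
tpl = (('CS', 1010, 'S'), ('is', 'easy'))
n = len(tpl)
print(n)  # 2 (the inner tuples are not counted element by element)

print(tpl[n - 1] == tpl[-1])  # True: last element
print(tpl[0] == tpl[-n])      # True: first element

errors = []
for idx in (2, -3):
    try:
        tpl[idx]
    except IndexError:
        errors.append(idx)

print(errors)  # [2, -3]
```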
Slicing
As for slicing, there is actually no change to the procedure. Also note that we will be constructing another tuple as the result. The original tuple will not be modified in any way. Our procedure simply states the following.
... we start from the element at the index determined by `start` ... until we reach the element determined by `stop`.
So as long as we can do indexing, we can do slicing. But it is still instructive to show the behavior since we have nesting. Let us expand the tuple into the following.
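As a minimal sketch, assume the expanded tuple is `tpl1` below and the slice is `tpl1[1:3]`. Both the exact contents of `tpl1` and the slice bounds are assumptions, chosen so that the element at `stop` is `'easy'` and the element just before it is `('very', 'very')`.

```python
# Assumed expanded tuple (not taken verbatim from the notes).
tpl1 = (('CS', 1010, 'S'), 'is', ('very', 'very'), 'easy')

# start = 1 is included, stop = 3 is excluded.
tpl2 = tpl1[1:3]

print(tpl2)       # ('is', ('very', 'very'))
print(len(tpl1))  # 4 (the original tuple is not modified)
```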
Notice that the element at `stop` is the entire string `'easy'`.
Therefore we exclude it, but we include the element before it, which is the tuple `('very', 'very')`.
This tuple is copied as it is.
The box-and-arrow diagram before and after the slicing is shown below.
The tuple created by slicing (i.e., `tpl2`) is a new tuple, as seen from the new sequence of boxes.
Also, the value `'is'` is copied.
Can we also say the same thing for the value of `tpl1[2]`?
If we say that the value is whatever is inside the box, this is exactly the arrow starting from ✱.
We are, in fact, copying this arrow.
This has a very important implication.
The actual tuple `('very', 'very')` is not duplicated.
The two arrows (i.e., one from `tpl1[2]` and the other from `tpl2[1]`) are pointing to the same location.
This is called aliasing and will be the source of many problems in the future.
For now, this will not really cause us problems, but it is good to know about potential problems in advance.
Aliasing
Aliasing happens when we have two different arrows pointing to the same location.
This means that the data is shared by both arrows.
If we have two variables `v1` and `v2` pointing to the same location, then we say that `v1` is an alias of `v2` (and vice versa).
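Python's `is` operator checks whether two arrows point to the same location, so it can be used to observe aliasing. The tuple `tpl1` below is an assumed example consistent with the slicing discussion, not one taken verbatim from the notes.

```python
tpl1 = (('CS', 1010, 'S'), 'is', ('very', 'very'), 'easy')  # assumed example
tpl2 = tpl1[1:3]

new_outer = tpl2 is not tpl1      # slicing built a new sequence of boxes
shared_inner = tpl1[2] is tpl2[1]  # but the inner tuple is shared

v1 = ('very', 'very')
v2 = v1                            # v2 is an alias of v1
aliased = v1 is v2

print(new_outer)     # True
print(shared_inner)  # True
print(aliased)       # True
```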
Iteration
With all the work we have done, the explanation for iteration is also simple. We simply index the elements one by one, from left to right. The potential confusion is the same as before: what counts as an element? But hopefully by now it is clear that an entire tuple can be a single element of another tuple.
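A for-loop over the nested tuple shows that each inner tuple is visited as a single element. The list `seen` is only there to record the order of the visits.

```python
tpl = (('CS', 1010, 'S'), ('is', 'easy'))

seen = []
for elem in tpl:
    seen.append(elem)  # elem is an entire inner tuple
    print(elem)

print(len(seen))  # 2
```

The loop body runs twice, not five times, because the loop does not look inside the inner tuples.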
Tuple Operation
Similar to strings, there are arithmetic-like operations for tuples.
Arithmetic-Like Operations
Operation | Symbol | Example Code | Result |
---|---|---|---|
Repetition | `*` | `2 * ('CS', 1010, 'S')` | `('CS', 1010, 'S', 'CS', 1010, 'S')` |
Concatenation | `+` | `('CS', 1010) + ('S',)` | `('CS', 1010, 'S')` |
Also similar to strings, these operations always create a new tuple.
However, the elements may be shared (i.e., aliased).
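The sketch below illustrates both points: the result is a new tuple, yet a nested element inside it is the very same tuple as before. The names `inner`, `a`, `b`, and `c` are illustrative.

```python
inner = ('very', 'very')
a = ('is', inner)

b = a + ('easy',)  # concatenation
c = 2 * a          # repetition

print(b)  # ('is', ('very', 'very'), 'easy')
print(c)  # ('is', ('very', 'very'), 'is', ('very', 'very'))

print(b is a)         # False: a new tuple was created
print(b[1] is inner)  # True: the nested element is shared
print(c[1] is c[3])   # True: both repetitions share it too
```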
Relational operations on tuples also follow the same procedure as for strings.
But because tuples can be nested, the comparison is also nested in a way.
Our example below uses `<`, but the same nested comparison also works for other relational operators like `==`³.
Tuple Relational Operation
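A few comparisons, sketched with illustrative values: elements are compared from left to right, and when two elements are themselves tuples, they are compared in the same way.

```python
flat = (1, 2, 3) < (1, 2, 4)             # decided at index 2: 3 < 4
prefix = (1, 2) < (1, 2, 0)              # a shorter prefix is smaller
nested = ((1, 2), 3) < ((1, 3), 0)       # decided inside index 0: (1, 2) < (1, 3)
equal = ((1, 2), 3) == ((1, 2), 3)       # nested equality

print(flat, prefix, nested, equal)  # True True True True
```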
There is an additional problem that may arise because a tuple can be a sequence of anything: two elements may be incomparable. This was not a problem for strings because an element of a string is always a string. But that is not so for tuples. This may lead to the following common mistakes.
Common Mistakes
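A typical case, sketched with illustrative values, is comparing a string element against an integer element. The error only appears if the comparison actually reaches the mismatched pair.

```python
try:
    # Index 0 is equal (1 == 1), so 'a' < 2 is attempted next.
    reached = (1, 'a') < (1, 2)
except TypeError:
    reached = "TypeError"

# Here index 0 already decides the result, so 'a' and 2 are never compared.
not_reached = (0, 'a') < (1, 2)

print(reached)      # TypeError
print(not_reached)  # True
```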
What this also means is that certain built-in functions that require comparison may not always work.
For instance, the built-in functions `min` and `max` may not work if there are incomparable elements in the tuple.
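As a brief sketch with illustrative values, `max` and `min` work when all elements are comparable and raise `TypeError` otherwise.

```python
largest = max((3, 1, 2))     # all integers: fine
smallest = min(('b', 'a'))   # all strings: fine

try:
    mixed_result = min((1, 'a', 2))  # integer vs string: incomparable
except TypeError:
    mixed_result = "TypeError"

print(largest)       # 3
print(smallest)      # a
print(mixed_result)  # TypeError
```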
Recall that there is another relational operator for sequences called the `in` operator.
There is also the corresponding `not in` operator.
For tuples, this operator behaves differently from strings.
In the case of strings, we are looking for a consecutive substring.
Recap of String in Operator
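As a quick reminder, sketched with an illustrative string, `in` on strings succeeds only when the characters appear consecutively.

```python
text = 'CS1010S'

consecutive = '10' in text       # '1', '0' appear next to each other
scattered = 'CSS' in text        # all characters exist, but not consecutively
negated = 'CSS' not in text

print(consecutive)  # True
print(scattered)    # False
print(negated)      # True
```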
However, for tuples, we are looking only for a specific element and not for a consecutive subsequence. This specific element must also be directly in the tuple and not nested deeper inside.
Tuple in Operator
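The sketch below, using illustrative tuples, shows both restrictions: a subsequence is not found, and neither is a value nested one level deeper.

```python
flat = ('CS', 1010, 'S')
tpl = (('CS', 1010, 'S'), ('is', 'easy'))

direct = 1010 in flat                 # a direct element
subsequence = ('CS', 1010) in flat    # no element equals the tuple ('CS', 1010)
whole_inner = ('is', 'easy') in tpl   # an inner tuple is a direct element
too_deep = 'is' in tpl                # nested deeper, so not found
negated = 'easy' not in tpl

print(direct, subsequence)            # True False
print(whole_inner, too_deep, negated)  # True False True
```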