Range
Learning Objectives
At the end of this sub-unit, students should
- understand range.
- know how to work with range.
- know how to use range in for-loop.
Sequence as Abstraction
Previously, we did an abstraction on problems by thinking in terms of high-level operations. We also abstracted a block of code into functions. There is another kind of abstraction called a data abstraction.
The same idea remains. Instead of thinking of a data in its implementation details, we want to think of a data as what we can do with the data. To illustrate, we will now abstract what we meant by a sequence. We know that a sequence is an ordered collection and we already know one sequence called a string. So how do we abstract a sequence?
Sequence
Consider a data type \(T\) with a variable var
of type \(T\).
We can say that \(T\) is a sequence if the following operations can be performed assuming valid input.
- Indexing:
var[index]
.- The return type will be the type of the element.
- Slicing:
var[start : stop : step]
.- The return type will be the same as the current type.
- Iteration: It can be used in a new construct called for-loop.
- Without the
for
keyword, it will be checking if an element is inside the sequence (i.e.,elem in seq
).
- Without the
Viewed in this lens, a string is a sequence because we can do the three operations. We can show that the first two operations can be done on string.
For-Loop
There is another way to perform a loop called the for-loop. For-loop is intended to simplify looping over a sequence.
Syntax
- It must start with the
for
keyword.- The
for
keyword is followed by a variable name. - It is then followed by a sequence (i.e., an expression that evaluates to a sequence).
- This line is terminated by a colon (i.e.,
:
). - The next line is an inner block called the loop body.
- The
Semantics
- Evaluate the sequence
seq
into an iterable collection of values \(\{V_1, V_2, \cdots, V_n\}\)1. - If there is no next element, exit to Step 6.
- Otherwise, there is a next element, assign the next element to
var
. - Evaluate the block
body
with this new value ofvar
. - Go back to Step 2.
- End of loop.
Since we currently only know about string, we will illustrate this with a string.
Consider a string "ABCD"
.
It can be treated as a sequence of values {'A'
, 'B'
, 'C'
, 'D'
}.
So in the first iteration, we will assign 'A'
to var
.
In the second iteration, we will assign 'B'
to var
.
This continues until the last one.
Afterwards, there is no next element and we exit.
Note that the last value of var
is going to be 'D'
as that was the last value in the sequence.
We should point out that we can solve this with while-loop as seen in the other tab.
However, there is a slight difference in the behavior.
In the case of for-loop, elem
has a value that comes from the sequence.
On the other hand, for while-loop, if we try to access seq[i]
after the loop ends, we will get IndexError
.
But beside this slight differences, the behavior during the iteration is practically the same.
So now we have completely shown that string is a sequence as we can perform the three required operations for our definition of a sequence. We will now look at other sequence.
Bad Practice #1
The first bad practice is the update of the sequence in ❬ expression ❭
.
As seen in the semantics above, the evaluation of this ❬ expression ❭
is performed exactly once in Step 1
So once the iterable is created, it may not matter if the underlying variable is re-assigned.
So the following code will not result in an infinite loop.
Bad Practice #2
The second bad practice is the update of the variable ❬ var ❭
in the function body.
As seen in Step 3 of the semantics, the variable is assigned a value at the beginning of the next iteration.
So it does not matter if the value of the variable is modified within the loop as it will not change what the next value is.
This might be easier to see in the case of a string, but it is a common mistake for range
.
As this is a common mistake for range
, we will show a similar example later.
Sequence of Integer
One of the most useful sequence is the sequence of integer.
We can create a sequence of integer using the built-in data type called range
2.
This creates a range
object that contains a sequence of integer.
There are three ways we can use range
.
range(stop)
.range(start, stop)
.range(start, stop, step)
.
However, we can simply understand the 3rd and we can map the first two into the third.
range(stop)
→range(0, stop, 1)
.range(start, stop)
→range(start, stop, 1)
.
So if we can understand the last usage, we can fully undertand range
.
Unlike slicing, the behavior of range
is simpler.
We are constructing the following sequence of values.
Similar to slicing, we will always exclude the value of stop
.
Because of that, there are cases where range(...)
will produce an empty sequence.
- If
step > 0
andstart <= stop
. - If
step < 0
andstart >= stop
.
Python also disallow having step = 0
.
We can see why from the sequence generated, this will be an infinite sequence.
After all, start
will be equal to start + step
and we will never reach stop
.
Given these explanations, we can show some examples of the use of range
.
range is a Sequence
To show that range
is a sequence, we will simply need to show that the other two operations can be done on range
.
You can practice typing for-loop to show the output.
Usefulness of range
The main advantage of range
is readability.
Consider the code to enumerate the values from 0 to n - 1
.
We typically do this to iterate n
times or to iterate over a sequence of n
elements (e.g., a string with n
characters).
The for-loop in general is more readable as the entire behavior of the loop is captured in a single line at the beginning of the loop. On the other hand, the while-loop version requires us to comprehend 3 different parts of the code.
i = 0
to understand that it starts from 0.i = i + 1
to understand the next value is the old value plus 1.i < n
to understand that it ends atn - 1
.
The same information is capture by for i in range(n)
which is equivalent to for i in range(0, n, 1)
.
In fact, for positive value of step
, the following are almost equivalent except for the value of i
after the loop.
They are only almost equivalent because of the common mistake explained below.
-
While-Loop
-
For-Loop
Common Mistake
As mentioned above, there is a common mistake in comprehending the behavior of range
in a for-loop.
We may mistakenly believe that the effect of for i in range(n)
is to increment the value of i
at the end.
This is not the case.
In Step 1 of the semantics, we see that the range
is only evaluated once.
From Step 3 of the semantics, we see that the value of i
is assigned a value that comes from the iterables and does not depend on the old value of i
.
The combined effect is as if we are producing the entire sequence of integer that can be produced by range
.
This sequence cannot be changed once created.
At the beginning of the next iteration, we are simply assigning the next value in this sequence.
Consider the following code.
The incorrect trace will simply increment i
from the already updated value due to i = i + 1
.
However, this is not the case.
The assignment i = i + 1
does not impacy the next value.
This is also why we say that the while-loop version is an almost equivalent.
You can recreate the incorrect trace with the following incorrect conversion to while-loop.
Delayed Computation
The way range
is implemented is rather nice.
In particular, the actual elements of the range
will not be computed unless the actual value is needed.
To see this, we can try to construct a range
with a large number of elements, say 100 billion integers.
Run it, see how fast it is.
Now, let us say we want to actually run the loop.
But we will do nothing (i.e., just pass
).
Run this with smaller number first to see the difference in runtime.
-
An iterable is a weaker form of an ordered collection where we do not need indexing or slicing as long as it can be used in an iteration. The distinction is mainly irrelevant and will only be used to distinguish the result of
map
andfilter
. ↩ -
Some notes may refer to this as built-in function. This is more correctly a data type with
range(...)
being a constructor (i.e., a special function to instantiate the data). ↩