Skip to content

Range

Learning Objectives

At the end of this sub-unit, students should

  • understand range.
  • know how to work with range.
  • know how to use range in for-loop.

Sequence as Abstraction

Previously, we did an abstraction on problems by thinking in terms of high-level operations. We also abstracted a block of code into functions. There is another kind of abstraction called a data abstraction.

The same idea remains. Instead of thinking of a data in its implementation details, we want to think of a data as what we can do with the data. To illustrate, we will now abstract what we meant by a sequence. We know that a sequence is an ordered collection and we already know one sequence called a string. So how do we abstract a sequence?

Sequence

Consider a data type \(T\) with a variable var of type \(T\). We can say that \(T\) is a sequence if the following operations can be performed assuming valid input.

  • Indexing: var[index].
    • The return type will be the type of the element.
  • Slicing: var[start : stop : step].
    • The return type will be the same as the current type.
  • Iteration: It can be used in a new construct called for-loop.
    • Without the for keyword, it will be checking if an element is inside the sequence (i.e., elem in seq).

Viewed in this lens, a string is a sequence because we can do the three operations. We can show that the first two operations can be done on string.

1
2
3
4
5
>>> s = 'CS1010S'
>>> s[3]
'0'
>>> s[1:5:2]
'S0'

For-Loop

There is another way to perform a loop called the for-loop. For-loop is intended to simplify looping over a sequence.

Syntax

for  var  in  expression  :
   block 
  • It must start with the for keyword.
    • The for keyword is followed by a variable name.
    • It is then followed by a sequence (i.e., an expression that evaluates to a sequence).
    • This line is terminated by a colon (i.e., :).
    • The next line is an inner block called the loop body.

Semantics

for var in seq:
  body
  1. Evaluate the sequence seq into an iterable collection of values \(\{V_1, V_2, \cdots, V_n\}\)1.
  2. If there is no next element, exit to Step 6.
  3. Otherwise, there is a next element, assign the next element to var.
  4. Evaluate the block body with this new value of var.
  5. Go back to Step 2.
  6. End of loop.

Since we currently only know about string, we will illustrate this with a string. Consider a string "ABCD". It can be treated as a sequence of values {'A', 'B', 'C', 'D'}. So in the first iteration, we will assign 'A' to var. In the second iteration, we will assign 'B' to var. This continues until the last one. Afterwards, there is no next element and we exit. Note that the last value of var is going to be 'D' as that was the last value in the sequence.

>>> seq = 'ABCD'
>>> for elem in seq:
...   print(elem)
...
A
B
C
D
>>> elem
'D'
>>> seq = 'ABCD'
>>> i = 0
>>> while i < len(seq):
...   print(seq[i])
...   i = i + 1
...
A
B
C
D

We should point out that we can solve this with while-loop as seen in the other tab. However, there is a slight difference in the behavior. In the case of for-loop, elem has a value that comes from the sequence. On the other hand, for while-loop, if we try to access seq[i] after the loop ends, we will get IndexError. But beside this slight differences, the behavior during the iteration is practically the same.

So now we have completely shown that string is a sequence as we can perform the three required operations for our definition of a sequence. We will now look at other sequence.

Bad Practice #1

The first bad practice is the update of the sequence in expression . As seen in the semantics above, the evaluation of this expression is performed exactly once in Step 1 So once the iterable is created, it may not matter if the underlying variable is re-assigned. So the following code will not result in an infinite loop.

1
2
3
4
text = 'abcd'
for char in text:
  text = text + char
print(text)
abcdabcd

Bad Practice #2

The second bad practice is the update of the variable var in the function body. As seen in Step 3 of the semantics, the variable is assigned a value at the beginning of the next iteration. So it does not matter if the value of the variable is modified within the loop as it will not change what the next value is. This might be easier to see in the case of a string, but it is a common mistake for range.

1
2
3
4
5
text = 'abcd'
for char in text:
  char = 'x'
  text = text + char
print(text)
abcdxxx

As this is a common mistake for range, we will show a similar example later.

Sequence of Integer

One of the most useful sequence is the sequence of integer. We can create a sequence of integer using the built-in data type called range2. This creates a range object that contains a sequence of integer.

There are three ways we can use range.

  1. range(stop).
  2. range(start, stop).
  3. range(start, stop, step).

However, we can simply understand the 3rd and we can map the first two into the third.

  1. range(stop)range(0, stop, 1).
  2. range(start, stop)range(start, stop, 1).

So if we can understand the last usage, we can fully undertand range. Unlike slicing, the behavior of range is simpler. We are constructing the following sequence of values.

{start, start+step, start+step+step, ...}

Similar to slicing, we will always exclude the value of stop. Because of that, there are cases where range(...) will produce an empty sequence.

  • If step > 0 and start <= stop.
  • If step < 0 and start >= stop.

Python also disallow having step = 0. We can see why from the sequence generated, this will be an infinite sequence. After all, start will be equal to start + step and we will never reach stop.

Given these explanations, we can show some examples of the use of range.

1
2
3
range(5)   # {0, 1, 2, 3, 4}
range(7)   # {0, 1, 2, 3, 4, 5, 6}
range(-1)  # { }
1
2
3
4
5
6
7
8
>>> for val in range(5):
...   print(val)
...
0
1
2
3
4
1
2
3
range(2, 5)   # {2, 3, 4}
range(-1, 7)  # {-1, 0, 1, 2, 3, 4, 5, 6}
range(3, 2)   # { }
1
2
3
4
5
6
7
8
>>> for val in range(-2, 3):
...   print(val)
...
-2
-1
0
1
2
1
2
3
range(2, 5, 2)    # {2, 4}
range(7, -1, -3)  # {7, 4, 1}
range(2, 3, -1)   # { }
1
2
3
4
5
6
7
8
>>> for val in range(7, -2, -2):
...   print(val)
...
7
5
3
1
-1

range is a Sequence

To show that range is a sequence, we will simply need to show that the other two operations can be done on range. You can practice typing for-loop to show the output.

1
2
3
rng = range(-5, 5)  # {-5, -4, -3, -2, -1, 0, 1, 2, 3, 4}
print(rng[3])       # prints -2
rng = rng[1:8:2]    # {-4, -2, 0, 2}

Usefulness of range

The main advantage of range is readability. Consider the code to enumerate the values from 0 to n - 1. We typically do this to iterate n times or to iterate over a sequence of n elements (e.g., a string with n characters).

  • While-Loop


    1
    2
    3
    4
    i = 0
    while i < n:
      # code
      i = i + 1
    
  • For-Loop


    for i in range(n):
      # code
    

The for-loop in general is more readable as the entire behavior of the loop is captured in a single line at the beginning of the loop. On the other hand, the while-loop version requires us to comprehend 3 different parts of the code.

  • i = 0 to understand that it starts from 0.
  • i = i + 1 to understand the next value is the old value plus 1.
  • i < n to understand that it ends at n - 1.

The same information is capture by for i in range(n) which is equivalent to for i in range(0, n, 1). In fact, for positive value of step, the following are almost equivalent except for the value of i after the loop. They are only almost equivalent because of the common mistake explained below.

  • While-Loop


    1
    2
    3
    4
    i = start
    while i < stop:
      # code
      i = i + step
    
  • For-Loop


    for i in range(start, stop, step):
      # code
    

Common Mistake

As mentioned above, there is a common mistake in comprehending the behavior of range in a for-loop. We may mistakenly believe that the effect of for i in range(n) is to increment the value of i at the end. This is not the case.

In Step 1 of the semantics, we see that the range is only evaluated once. From Step 3 of the semantics, we see that the value of i is assigned a value that comes from the iterables and does not depend on the old value of i. The combined effect is as if we are producing the entire sequence of integer that can be produced by range. This sequence cannot be changed once created. At the beginning of the next iteration, we are simply assigning the next value in this sequence.

Consider the following code.

1
2
3
for i in range(5):
  i = i + 1
  print(i)
1
2
3
4
5
1
2
3
4
5
1
2
3
1
3
5

The incorrect trace will simply increment i from the already updated value due to i = i + 1. However, this is not the case. The assignment i = i + 1 does not impacy the next value. This is also why we say that the while-loop version is an almost equivalent. You can recreate the incorrect trace with the following incorrect conversion to while-loop.

1
2
3
4
5
i = 0
while i < n:
  # code
  # code
  i = i + 1
1
2
3
4
5
i = 0
while i < n:
  i = i + 1
  print(i)
  i = i + 1
Delayed Computation

The way range is implemented is rather nice. In particular, the actual elements of the range will not be computed unless the actual value is needed. To see this, we can try to construct a range with a large number of elements, say 100 billion integers. Run it, see how fast it is.

range(100_000_000_000)

Now, let us say we want to actually run the loop. But we will do nothing (i.e., just pass). Run this with smaller number first to see the difference in runtime.

for elem in range(100_000_000_000):
  pass

  1. An iterable is a weaker form of an ordered collection where we do not need indexing or slicing as long as it can be used in an iteration. The distinction is mainly irrelevant and will only be used to distinguish the result of map and filter

  2. Some notes may refer to this as built-in function. This is more correctly a data type with range(...) being a constructor (i.e., a special function to instantiate the data).