Scoping

Learning Objectives

At the end of this sub-unit, students should

appreciate the benefit of scoping of variables.
understand lexical scoping of variables.
know how to resolve variables to values based on scoping.
know how to evaluate functions with access to global variables.

Naming Conflict

Without functions, we have to rely on conventions to resolve name conflicts. In particular, variables that are deemed local to a sequence of code should not be used outside of that code. This leads to a lot of complications as our code gets bigger. Imagine not being able to name a variable student_name because the same name has been used in other unrelated parts of the code.

The way we represent our function is a black box where we do not care about the name of the variables local to the function. How does this work? To answer this, we need to understand how variables are created.

There is no variable declaration in Python. But based on the rule of assignment, a variable is created if the name does not exist. Unfortunately, this rule is incomplete. To make the rule complete, we need to know where a variable is located.

In Python, an assignment to a variable¹ will create the variable within the scope of the function. This creation is done even before the assignment is executed. We call this the local scope as it is local to the function. Alternatively, if a variable is assigned outside of any function, we call that the global scope. Global scope is available to all functions but local scope is only available to the function.

This may lead to rather confusing code if we are not careful, so let us illustrate it with several examples. The first set of examples shows that we can access variables in the global scope.

Global Variable #1 | Global Variable #2 | Global Variable #3

n = 3
def f():
  print(n)

f()    # prints 3

Here, n is created in the global scope. The function f can access global scope so it will print 3.

def f():
  print(n)
n = 3

f()    # prints 3

Similar to before, but note that defining f does not evaluate the body. So as long as n is declared before f is invoked, there is no error.

def f():
  print(n)  # source of error

f()    # NameError
n = 3  # - not executed

NameError

In this case, n is only declared after the function f is invoked. So when we really need the value of n at Line 2, it is not yet available.
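The three cases above can be combined into one short script. The sketch below is only an illustration (the function name `show` and the two result variables are hypothetical). It calls the same function before and after the global n exists, to show that the name is looked up when the body runs and not when the function is defined.

```python
def show():
    # n is never assigned in this body, so it is looked up in the
    # global scope at the moment the body runs.
    return n

try:
    show()                      # no global n exists yet
    first_attempt = "ok"
except NameError:
    first_attempt = "NameError"

n = 3                           # now the global n exists
second_attempt = show()         # the very same function now succeeds

print(first_attempt)
print(second_attempt)
```

The definition of `show` is never changed between the two calls. Only the content of the global scope changes.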

The second set of examples is to show that assignment inside the function will create a new variable in the local scope. If this variable has the same name as another variable in the global scope, then the global variable is shadowed by the local variable. When we refer to this name, we will be accessing the local version.

Local Variable #1 | Local Variable #2 | Local Variable #3

n = 3       # global n
def f():
  n = 99    # local n
  print(n)  # refers to local n at Line 3

print(n)  # 3 is printed
f()       # 99 is printed by Line 4
print(n)  # 3 is printed

3
99
3

At Line 3, we create a new variable that is available only locally within f. So the global n is unchanged even by the assignment n = 99 at Line 3. This is seen from Line 8 where 3 is still printed.

n = 3       # global n
def f():
  print(n)  # refers to local n at Line 4
  n = 99    # local n

print(n)  # 3 is printed
f()       # UnboundLocalError
print(n)  # - not executed

3
UnboundLocalError

This behavior is confusing because print(n) appears before n = 99, so we might expect print(n) to refer to the global version. Instead, it still refers to the local version.

n = 3       # global n
def f(n):
  n = 99    # updates n from Line 2
  print(n)  # refers to local n declared at Line 2 (updated at Line 3)

print(n)  # 3 is printed
f(n)      # ⇝ f(3) 99 is printed by Line 4
print(n)  # 3 is printed

3
99
3

Call by value behaves as if the argument is assigned to the parameter. We can think of it as if there is an assignment that creates a local variable with the same name as the parameter. So a parameter is treated like a local variable.

Think of it like a "summarizing" process. When we define a function, we analyze the function so that we know which variables are local to the function. So if we look at the example in "Local Variable #2", we can summarize it as follows.

def f():    # local: n
  print(n)
  n = 99

>>> f()
UnboundLocalError

From this summary, it is hopefully clear that print(n) refers to the local n regardless of whether there is a variable n declared globally or not. Here we have variable n declared locally but without any value. The value should be treated as unbound rather than as nothing, because we represent nothing with None. The value is not even None; it is simply unbound. Hence, we get UnboundLocalError if we try to get the value of the variable via substitution².
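The "summary" of local names can be inspected directly. The sketch below assumes CPython, where each function object carries a code object whose co_varnames attribute lists the names treated as local. The function names are hypothetical and only serve the illustration.

```python
def reads_only():
    return n            # no assignment to n, so n is not local

def reads_then_assigns():
    print(n)            # n is local because of the assignment below
    n = 99

# Names that were decided to be local when each function was defined.
local_names_a = reads_only.__code__.co_varnames
local_names_b = reads_then_assigns.__code__.co_varnames

n = 3                   # a global n does not help the second function
try:
    reads_then_assigns()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(local_names_a)
print(local_names_b)
print(outcome)
```

The local names are fixed before either function is ever called, which matches the idea that the summary is made at definition time. Note also that UnboundLocalError is a subclass of NameError, so an `except NameError` clause would catch both.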

The illustration with a toilet roll is shown below. An unbound variable does not even have an empty toilet roll, but at least there is still a place to eventually hold one. An undeclared variable does not even have a hypothetical space. This corresponds to a NameError.

Roll

Local and Global State

With functions, we now have two different scopes. One scope is available globally and the other locally within each function. The same name can exist in both scopes, so we need a different way to represent the state of the program at a particular line in the code. We need a state that can capture both the global and the local scope.

The actual memory model of Python is much more complicated than what we are going to use here. But the complication is the result of function closures due to higher-order functions. As we are not going to go into the details of higher-order functions, this representation will be sufficient for our purpose.

We will now represent our state as a "chain" of scopes.

λ:{ name1 ↦ value1 , name2 ↦ value2 , ... } ⟶ γ:{ name3 ↦ value3 , name4 ↦ value4 , ... }

We use lambda (i.e., λ) to represent local scope and we use gamma (i.e., γ) to represent global scope³. If the context is unclear, we may also add the function name as a subscript (e.g., λ_factorial(5)) to indicate that we are referring to the local scope of factorial when invoked with an argument of value 5. This last part is quite important because a scope will only be used when we are actually executing the function body.

So now, we can explain the behavior of global and local variables above in greater detail. The highlighted parts are function definitions. We highlight them because the function is not executed at that point. Instead, we treat the definition as a whole and add a mapping from the name to the function definition. Also note the use of the symbol n ↦ ∅ to represent that n is inside the current scope but currently unbound. In particular, if we try to substitute the name n with the value ∅, we get UnboundLocalError.
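The lookup along the chain λ ⟶ γ can be written as a small function. This is only a sketch of the simplified model used in these notes, not of how Python is implemented. The names `resolve` and `UNBOUND` are hypothetical, with `UNBOUND` standing in for the symbol ∅, and built-in names such as print are ignored.

```python
UNBOUND = object()   # hypothetical stand-in for the symbol ∅

def resolve(name, local_scope, global_scope):
    """Look a name up along the chain λ ⟶ γ."""
    if name in local_scope:
        value = local_scope[name]
        if value is UNBOUND:
            # The name is in λ but has no value yet.
            raise UnboundLocalError(name)
        return value
    if name in global_scope:
        return global_scope[name]
    # The name is in neither λ nor γ.
    raise NameError(name)

gamma = {"n": 3}

print(resolve("n", {}, gamma))           # not in λ, found in γ
print(resolve("n", {"n": 99}, gamma))    # found in λ, which shadows γ
```

Calling `resolve("n", {"n": UNBOUND}, gamma)` raises UnboundLocalError even though γ has a value for n, because the search stops at the first scope that contains the name.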

The executions for global variables are shown below. Because some executions produce an error, we stop the execution at that point.

Global Variable #1 | Global Variable #2 | Global Variable #3

# γ:{ }
n = 3
# γ:{ n ↦ 3 }
def f():
  print(n)

# γ:{ n ↦ 3 , f ↦ <def f> }
f()  #  ⟾
#  ⟽ None  (has no effect without assignment)

#  ⟾ f()
def f():
  # λ:{ } ⟶ γ:{ n ↦ 3 , f ↦ <def f> }
  print(n)  # ⇝  no "n" in λ
            # ⇝  print(3)  (from γ)
  #  ⟽ None  (because there is no return)

# γ:{ }
def f():
  print(n)
# γ:{ f ↦ <def f> }
n = 3

# γ:{ n ↦ 3 , f ↦ <def f> }  (order is irrelevant)
f()  #  ⟾
#  ⟽ None  (has no effect without assignment)

#  ⟾ f()
def f():
  # λ:{ } ⟶ γ:{ n ↦ 3 , f ↦ <def f> }
  print(n)  # ⇝  no "n" in λ
            # ⇝  print(3)  (from γ)
  #  ⟽ None  (because there is no return)

# γ:{ }
def f():
  print(n)
# γ:{ f ↦ <def f> }

# γ:{ f ↦ <def f> }
f()  #  ⟾
# - execution stops

#  ⟾ f()
def f():
  # λ:{ } ⟶ γ:{ f ↦ <def f> }
  print(n)  # ⇝  no "n" in λ
            # ⇝  no "n" in γ
            # ⇝  NameError
  # - execution stops

The executions for local variables are shown below. We leave "Local Variable #3" as an exercise for the reader.

Local Variable #1 | Local Variable #2

# γ:{ }
n = 3
# γ:{ n ↦ 3 }
def f():
  n = 99
  print(n)

# γ:{ n ↦ 3 , f ↦ <def f> }
print(n)  # ⇝  print(3)
# γ:{ n ↦ 3 , f ↦ <def f> }
f()  #  ⟾
#  ⟽ None  (has no effect without assignment)
# γ:{ n ↦ 3 , f ↦ <def f> }
print(n)  # ⇝  print(3)
# γ:{ n ↦ 3 , f ↦ <def f> }

#  ⟾ f()
def f():
  # λ:{ n ↦ ∅ } ⟶ γ:{ n ↦ 3 , f ↦ <def f> }
  n = 99
  # λ:{ n ↦ 99 } ⟶ γ:{ n ↦ 3 , f ↦ <def f> }
  print(n)  # ⇝  print(99)
  #  ⟽ None  (because there is no return)

# γ:{ }
n = 3
# γ:{ n ↦ 3 }
def f():
  print(n)
  n = 99

# γ:{ n ↦ 3 , f ↦ <def f> }
print(n)  # ⇝  print(3)
# γ:{ n ↦ 3 , f ↦ <def f> }
f()  #  ⟾
# - execution stops

#  ⟾ f()
def f():
  # λ:{ n ↦ ∅ } ⟶ γ:{ f ↦ <def f> , n ↦ 3 }
  print(n)  # ⇝  print(∅)
            # ⇝  UnboundLocalError
  # - execution stops

Observe from the execution of local variables that we have the variable n in both the local and the global scope. Any change made to the local n will not affect the global n. This is good for making a self-contained function. But it is not good if we want our function to modify global values.

The consensus on best practice is that we want our functions to be self-contained. So making a self-contained function should be easier than the opposite. This is why the behavior is as shown above. Another reason is that there is no explicit variable declaration. So a convention has to be adopted, and the convention adopted is the one that makes writing self-contained functions easier.

Bad Practice

We mentioned that finding all the locals is like a "summarizing" process. In fact, we have shown that even if the assignment is never executed, the variable is still considered inside the function.

So in the example below, the first version of f will not produce an error but the second version will. In both cases, the execution exits the function immediately when it encounters return n. This means you need to be really careful with the indentation.

def f():   # local: {}
  return n
n = 4      # global: {n}

>>> f()
4

def f():   # local: {n}
  return n
  n = 4

>>> f()
UnboundLocalError

Function-Level Scoping

A little bit of clarification is needed about the local scope. We mentioned that a local scope is created for each function. This means that a variable assigned within another block (e.g., inside an if-statement or a while-loop) does not become a new variable that exists only within that block. That is why our if-statement and while-loop can work.

We may have taken this for granted earlier, but there are languages where scoping is per block instead of per function. Usually, languages with block-level scoping have explicit variable declarations. At least one language, JavaScript, allows both block-level and function-level scoping.
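In Python specifically, the absence of block-level scoping can be seen by assigning a variable inside an if-statement or a while-loop and reading it after the block has ended. The functions below are hypothetical examples written only for this illustration.

```python
def classify(number):
    if number % 2 == 0:
        label = "even"      # assigned inside the if-block
    else:
        label = "odd"       # assigned inside the else-block
    return label            # still visible after the block ends

def count_up(limit):
    i = 0
    while i < limit:
        last = i            # assigned inside the loop body
        i += 1
    return last             # visible after the loop, if it ran at least once

print(classify(4))
print(count_up(3))
```

Both label and last belong to the local scope of their function, not to the block in which they are assigned. If the loop in count_up never runs (for example, with a limit of 0), last stays unbound and the return raises UnboundLocalError.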

Python Function-Level Scoping

Python uses function-level scoping. In particular, if there is an assignment to a variable inside the function, regardless of whether the assignment is executed (or even whether it ever will be executed), the variable is in the local scope of the function.
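A minimal sketch of the "never executed" case is given below. The function name never_assigns is hypothetical. The assignment sits inside a branch that can never run, yet it is still enough to make n local to the function.

```python
n = 3   # global n

def never_assigns():
    if False:
        n = 4        # never runs, but still makes n local to the function
    return n         # refers to the local n, which has no value

try:
    never_assigns()
    result = "no error"
except UnboundLocalError:
    result = "UnboundLocalError"

print(result)
```

The global n is never consulted, because the decision that n is local was already made when the function was defined.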

Lexical Scoping

In Python, the scope of the variable depends only on the code. In particular, it depends only on the location of the variable in the code. More specifically, where the assignment is located.

If the assignment is located outside of any function, we say that the variable is in the global scope. Otherwise, it must be within some function. In that case, the variable is in the local scope of the function.

That is merely a rephrasing of what we have said before. So let us focus on what is left unsaid instead. As this is difficult, we will guide you through it. What the definition leaves unsaid is that the scope, and hence the existence, of a variable does not depend on the function that made the call. Consider the following code.

n = 3
def f():
  m = 2
  return g(n + m)
def g(n):
  return n + m

>>> f()
NameError

We can evaluate the code above and arrive at the following trace that produces a NameError. To simplify our state, we will use λf instead of the longer <def f>. As in mathematics, we invent simpler notation when necessary. Do not be afraid of notation; it is a powerful tool to have. You can invent your own notation for your own work when necessary, but be sure to use the common notation when answering questions.

# γ:{ }
n = 3
# γ:{ n ↦ 3 }
def f():
  m = 2
  return g(n + m)
# γ:{ n ↦ 3 , f ↦ λf }
def g(n):
  return n + m

# γ:{ n ↦ 3 , f ↦ λf ,
#     g ↦ λg }
f()  #  ⟾

#  ⟾ f()
def f():
  # λf:{ m ↦ ∅ } ↴
  # γ:{ n ↦ 3 , f ↦ λf ,
  #     g ↦ λg }
  m = 2
  # λf:{ m ↦ 2 } ↴
  # γ:{ n ↦ 3 , f ↦ λf ,
  #     g ↦ λg }
  return g(n + m)
  # ⇝ return g(3 + 2)
  # ⇝ return g(5)  ⟾

#  5 ⟾ g(5)
def g(n):
  # λg:{ n ↦ 5 } ↴
  # γ:{ n ↦ 3 , f ↦ λf ,
  #     g ↦ λg }
  return n + m
  # ⇝ return 5 + m
  # ⇝ NameError

In this execution, we arrive at NameError because we follow the execution from global to λf and then to λg. This can be quite troublesome, especially for so little benefit as a NameError. Even worse, if there is a loop, then we have to go through the loop until completion.

What we want is a simpler analysis that allows us to quickly determine if a variable exists or not. That way, maybe we can check for errors more quickly. This analysis can be thought of as the inverse of the scoping rule above. Given a variable, can we determine which assignment produces it?

This is the summarizing procedure we had before. Putting it in context, we can visually represent the functions as boxes of scope with local variables. The actual values will be written in the place of the underscore (i.e., _). As a summary, this is sufficient. But we will typically deal only with the functions that have not finished executing yet. After a function is completed, we can safely remove its box.

Scope01

As you are summarizing this, note the scope for each function is used directly and not prepended to the front of the scope chain. This is why we have the following series of scopes.

γ:{ n ↦ 3 , f ↦ λf , g ↦ λg }
λf:{ m ↦ 2 } ⟶ γ:{ n ↦ 3 , f ↦ λf , g ↦ λg }
λg:{ n ↦ 5 } ⟶ γ:{ n ↦ 3 , f ↦ λf , g ↦ λg }

Here, the arrow ⟶ corresponds to the scope directly enclosing the current scope. Also note that the third line is important because if we were merely prepending the scope, we would have gotten the following instead.

Incorrect Scoping

λg:{ n ↦ 5 } ⟶ λf:{ m ↦ 2 } ⟶ γ:{ n ↦ 3 , f ↦ λf , g ↦ λg }

This distinction is important because if the incorrect scoping is used, we would not have gotten the NameError because we will have the mapping m ↦ 2. A lot of things can be learnt even from an error⁴. Try to do this kind of reasoning to fully understand the behavior of Python. At some point, the amount of explanation we can give is insufficient due to the sheer amount of interactions with other constructs.
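The two chains can be compared side by side with plain dictionaries. The sketch below uses collections.ChainMap from the standard library, which searches a sequence of mappings in order. The variable names are hypothetical, and the strings "λf" and "λg" are placeholders for the function definitions.

```python
from collections import ChainMap

gamma = {"n": 3, "f": "λf", "g": "λg"}   # global scope
lambda_f = {"m": 2}                       # local scope of f()
lambda_g = {"n": 5}                       # local scope of g(5)

# Lexical scoping: the scope of g is followed directly by the global scope.
lexical = ChainMap(lambda_g, gamma)

# Incorrect scoping: the scope of the caller f is kept in the chain.
dynamic = ChainMap(lambda_g, lambda_f, gamma)

print(lexical["n"])        # the local n shadows the global n
print("m" in lexical)      # m cannot be reached, so n + m fails
print("m" in dynamic)      # m would wrongly be found here
```

Under the lexical chain, looking up m fails in the same way Python does with NameError. Under the incorrect chain the lookup succeeds and n + m would evaluate to 7.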

Global

If we really really want to modify the global variable, what can we do? Since every assignment only modifies the current scope by either creating a new variable or modifying its value, how do we modify a variable from outside of the scope? First, we cannot have a variable name shadowing the outer scope. Second, we need to add the keyword global to indicate that a variable is supposed to come from global scope.

n = 3
def f():
  global n
  # cannot have non-global n here
  n = 2

print(n) # prints 3
f()
print(n) # prints 2 modified inside f

Note that this is bad practice as it makes reasoning about a function more difficult. In general, we want a function to use only the information available from its parameters. This makes the function behave as if it were a mathematical function. There is a name for this: a pure function.
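To make the contrast concrete, the sketch below places a function that modifies a global variable next to a pure alternative. The names counter, bump_global and bump_pure are hypothetical and chosen only for this illustration.

```python
counter = 0

def bump_global():
    global counter          # the assignment below targets the global counter
    counter = counter + 1

def bump_pure(value):
    return value + 1        # depends only on its parameter

bump_global()
bump_global()
after_global = counter      # changed as a side effect of the two calls

start = 10
after_pure = bump_pure(start)   # start itself is left untouched

print(after_global)
print(start, after_pure)
```

The result of bump_pure can be predicted from its argument alone. To predict the effect of bump_global, we need to know the current value of counter and every earlier call that may have changed it.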

Call Tree

To fully understand lexical scoping, let us show the behavior on a non-error execution. This will also illustrate the benefit of having functions that prevent clashes in variable names. We will use the following function definitions.

def hypot(x, y):
  return sqrt(sum_sqr(x, y))

def sum_sqr(x, y):
  return sqr(x) + sqr(y)

def sqrt(x):
  return x ** 0.5

def sqr(x):
  return x * x

Let us evaluate hypot(3, 4). We use a small font size as we need to show the full evaluation. Also, we will exclude the function name from λ as the context is clear.

# γ:{ }
hypot(3, 4)  #  ⟾
5.0 #  ⟽

#  (3, 4) ⟾ hypot(3, 4)
def hypot(x, y):
  # λ:{ x ↦ 3 , y ↦ 4 }
  # ↳ γ:{ }
  return sqrt(sum_sqr(x, y))
  # ⇝ return sqrt(
  #      sum_sqr(3, 4)  ⟾
  #    )
  # ⇝ return sqrt(25)  ⟽ 25
  # ⇝ return sqrt(25)  ⟾
  # ⇝ return 5.0  ⟽ 5.0
  #  ⟽ 5.0

#  (3, 4) ⟾ sum_sqr(3, 4)
def sum_sqr(x, y):
  # λ:{ x ↦ 3 , y ↦ 4 }
  # ↳ γ:{ }
  return sqr(x) + sqr(y)
  # ⇝ return sqr(3)    ⟾
  #           +
  #           sqr(4)
  # ⇝ return 9  ⟽
  #           +
  #           sqr(4)    ⟾
  # ⇝ return 9 + 16 ⟽
  # ⇝ return 25
  #  ⟽ 25

#  25 ⟾ sqrt(25)
def sqrt(x):
  # λ:{ x ↦ 25 } ⟶ γ:{ }
  return x ** 0.5
  # ⇝ return 25 ** 0.5
  # ⇝ return 5.0
  #  ⟽ 5.0

#  3 ⟾ sqr(3)
def sqr(x):
  # λ:{ x ↦ 3 } ⟶ γ:{ }
  return x * x
  # ⇝ return 3 * 3
  # ⇝ return 9
  #  ⟽ 9

#  4 ⟾ sqr(4)
def sqr(x):
  # λ:{ x ↦ 4 } ⟶ γ:{ }
  return x * x
  # ⇝ return 4 * 4
  # ⇝ return 16
  #  ⟽ 16

At this point, you may be wondering if there is a simpler way to understand the behavior of a function call so that we do not have to go through all those long steps. There is, but it requires us to know clearly what each function does. If the function depends only on the input parameters and has no side effects, then we can treat it like a true black box called a pure function. We should strive to make all our functions this way. If we need more information, we can add more parameters if the problem permits. It is often the case that the problem already specifies the required parameters, which cannot be changed.

Assuming that all our functions are pure functions, we can simply write them like the black box we did before. We put the way the function is invoked inside a box to form our call tree. This way, we do not need to put the input on multiple incoming arrows. Additionally, we put the return value as a note on the outgoing dashed arrow. The call tree for the function call hypot(3, 4) above is shown below.

CallTree01
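A textual version of the same call tree can be produced by recording every call together with its return value. The sketch below is one possible way to do this and is not part of the original notes. The decorator traced and the lists call_log and active are hypothetical helpers, and indentation is used in place of the arrows.

```python
import functools

call_log = []     # one line per call, in the order the calls start
active = []       # names of the functions currently executing

def traced(func):
    """Hypothetical helper: log each call together with its return value."""
    @functools.wraps(func)
    def wrapper(*args):
        depth = len(active)            # how deep we are in the call tree
        slot = len(call_log)
        call_log.append("")            # reserve the line for this call
        active.append(func.__name__)
        result = func(*args)
        active.pop()
        shown_args = ", ".join(repr(a) for a in args)
        call_log[slot] = f"{'  ' * depth}{func.__name__}({shown_args}) -> {result!r}"
        return result
    return wrapper

@traced
def hypot(x, y):
    return sqrt(sum_sqr(x, y))

@traced
def sum_sqr(x, y):
    return sqr(x) + sqr(y)

@traced
def sqrt(x):
    return x ** 0.5

@traced
def sqr(x):
    return x * x

answer = hypot(3, 4)
print("\n".join(call_log))
```

Each line of the log corresponds to one box of the call tree, and the value after the arrow corresponds to the note on the outgoing dashed arrow. Because every function here is pure, each line can be read on its own without tracing the scopes.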

¹ This part will be important later when we have an assignment that updates the content of a mutable element.
² Notice how the error differs between the local and the global scope. If the variable is not declared at all, not even globally, then we get NameError. But if the variable is declared locally but not yet bound to any value, then we get UnboundLocalError.
³ If you know your Greek, just remember λ → lambda → l → local and γ → gamma → g → global.
⁴ There is a name for this kind of scoping mechanism. It is called dynamic scoping, as opposed to our lexical scoping, which is also called static scoping.