Complex Operations
Learning Objectives
At the end of this sub-unit, students should
- be able to compose simple operations to form complex operations.
- know how to evaluate complex operations in Python.
- know how to write complex operations in Python.
The Need for Order
You may have come across the mathematical problem that took the internet by surprise. It may look simple at first but, surprisingly, people come to different conclusions about the result. The problem is a simple mathematical formula: \(6 \div 2(1 + 2)\).
People on the internet are split about what the answer should be, and there are two main camps: those who say the answer is \(9\) and those who say it is \(1\).
There is even an image of two different calculators giving different results.
Implicit Multiplication
The real problem with \(6 \div 2(1 + 2)\) is that implicit multiplication (i.e., \(2(1 + 2)\)) is not formally defined, so different calculators follow different conventions. If we follow PEMDAS blindly and treat implicit multiplication the same as explicit multiplication, we get \(6 \div 2 \times (1 + 2)\), which is how we get \(9\).
On the other hand, many people1 treat implicit multiplication with higher priority than normal multiplication. In this case, the formula is equivalent to \(6 \div (2 \times (1 + 2))\). This is how we get \(1\).
In any case, notice how, once a rule is fixed, there is no ambiguity in the evaluation. The ambiguity lies in which rule to use. Luckily, an expression with implicit multiplication is not a valid Python program, so this issue will not occur in Python.
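Since the multiplication must always be written explicitly in Python, each of the two readings has to be spelled out. The snippet below is a minimal sketch of the two readings of \(6 \div 2(1 + 2)\), written with `/` for division; the choice of `/` is our assumption, and the results are therefore decimal numbers.

```python
# Reading 1: implicit multiplication treated like explicit multiplication.
print(6 / 2 * (1 + 2))    # → 9.0

# Reading 2: implicit multiplication given higher priority, forced with parentheses.
print(6 / (2 * (1 + 2)))  # → 1.0
```

Either way, the same line of code gives the same result on every computer.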
Having different results on two different machines is a problem for programming. Imagine sending your code to a friend who does not get the same result as you! That would make open-source software practically impossible. You also would not want to submit your program as an answer only to find that it behaves differently from how it did on your computer.
So we need consistency when the same program is run on two different computers. For that, we need conventions that dictate the order of operations. Without order, we only have chaos.
In the case of arithmetic operations, you are probably already familiar with the order of operations. Unfortunately, we have more operations than just arithmetic. The order of operations in Python has to consider all of these different kinds of operations.
Let us start with some terminology.
Order of operations is a convention about which operations to perform first.
If a given operation (e.g., `*`) is to be performed before another operation (e.g., `+`), we say that the operation to be performed first has higher precedence.
To put it in a sentence, we say that "`*` has higher precedence than `+`".
PEMDAS
You may have learnt the mnemonic PEMDAS for remembering the order of mathematical operations. Alternative names include BEDMAS and BODMAS, but the underlying convention is the same. The name PEMDAS collects the first letter of each operation:
- Parentheses (or bracket)
- Exponentiation
- Multiplication and Division
- Addition and Subtraction
The ordering above corresponds to the way we grouped the arithmetic operations in the previous sub-unit, except that we did not have parentheses there. Additionally, the multiplication and division group contains more operations, namely floor division and remainder. The table is reproduced below with the connection to PEMDAS shown in the first column.
PEMDAS
Group | Operation | Symbol | Associativity |
---|---|---|---|
Parentheses | Parentheses | `()` | Leftmost Innermost |
Exponentiation | Exponentiation | `**` | Right to Left |
Multiplication, Division | Multiplication, Division, Floor Division, Remainder | `*` `/` `//` `%` | Left to Right |
Addition, Subtraction | Addition, Subtraction | `+` `-` | Left to Right |
Look at the leftmost letters from top to bottom and you shall find PEMDAS. Associativity is the order of operations for operations within the same group.
Python's order of operations closely mimics the mathematical order of operations. This is good news for those of us who are already familiar with PEMDAS.
PEMDAS
So the rule of PEMDAS is that if we have different operators with different precedence, we perform the operation with the highest precedence first. It is as if we have parentheses around the operation with the highest precedence.
What about operations with the same precedence? This is where associativity comes in. For operations within the same group, we perform them according to the group's associativity after resolving all operations with higher precedence.
Mostly, the associativity is left to right, which matches how we read the Latin alphabet in the first place. However, one operation stands out as being different: exponentiation. Instead of left to right, the associativity for exponentiation is right to left.
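To make precedence and left to right associativity concrete, here is a minimal sketch. The numbers are arbitrary values of our own choosing, picked so that the different groupings give different results.

```python
# Precedence: * is performed before +, as if written 1 + (2 * 3).
print(1 + 2 * 3)      # → 7
print((1 + 2) * 3)    # → 9 (parentheses force the addition first)

# Associativity: - is left to right, as if written (10 - 4) - 3.
print(10 - 4 - 3)     # → 3
print(10 - (4 - 3))   # → 9 (what a right to left reading would give)
```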
Consider how we can show that this is the correct behavior for exponentiation. Here, the scientific method will be your ally. The scientific method is a useful tool in general because Python is already a mature language with many capabilities. There is only a limited subset of Python that we can teach, even with these extended notes. As such, we will also show how you can try to deduce the behavior of things we may have missed in our explanations.
First, we start with a conjecture. Our conjecture is that "exponentiation has right to left associativity". We need to understand what it really means. For right to left associativity (also called a right associative operation), the order of operations is from right to left when considering operations from the same group.
Since we consider only the exponentiation group, we consider the operation `x ** y ** z` where we leave `x`, `y`, and `z` unspecified for now.
We have two possibilities of the order of operations:
- Right to Left Associativity: `x ** (y ** z)`
- Left to Right Associativity: `(x ** y) ** z`
To differentiate and show that our conjecture is correct, we now need to specify the values of `x`, `y`, and `z` such that the two operations above produce different results.
It is important that they produce different results because otherwise we cannot determine which one is correct.
This is what is often called falsifiability.
A simple trial and error should give us `x = 2`, `y = 1`, and `z = 2`.
Plugging in the values above, we get:
- Right to Left Associativity: `2 ** (1 ** 2)` = `2 ** 1` = `2`
- Left to Right Associativity: `(2 ** 1) ** 2` = `2 ** 2` = `4`
So now we can test if `2 ** 1 ** 2` will produce `2` (i.e., right to left associativity) or `4` (i.e., left to right associativity).
Evaluating `2 ** 1 ** 2` in Python produces `2`, which confirms our conjecture that exponentiation is right associative.
Associativity
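The experiment can be sketched as follows. The two parenthesized lines are our own addition, included only so that the unparenthesized result can be compared against both possibilities.

```python
# The expression under test, with no parentheses.
print(2 ** 1 ** 2)      # → 2

# The two candidate readings, forced with parentheses.
print(2 ** (1 ** 2))    # → 2 (right to left)
print((2 ** 1) ** 2)    # → 4 (left to right)
```

The unparenthesized expression matches the right to left reading.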
Once we have found that our hypothesis is correct, it is useful to conclude our findings with either a summary or a potential explanation. We can look at mathematics to consider why exponentiation has right to left associativity. In mathematics, we often write nested exponentiation as \(x^{y^{z}}\). You may not be able to see it clearly, but there is a \(z\) at the top.
Unfortunately, when writing code, we are not capable of writing values as a superscript. The effect of flattening \(x^{y^{z}}\) is that we have to write it as \(x \otimes y \otimes z\) where \(x \otimes y\) is defined as \(x^y\). To recover the original superscript, we then have to perform the evaluation from right to left.
It should be noted that PEMDAS also works for arithmetic-like operations on strings.
So we can say that the order of operations is really about the symbol rather than the meaning of the operation itself.
This is related to how programs are "read" by the computer.
Your computer does not care if the value is a string or an integer: if it sees the `+` operator, it will give it lower precedence than the `*` operator.
Order of Operation for Arithmetic-Like Operation
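A minimal sketch of this behavior with strings is shown below. The strings `"a"` and `"b"` are arbitrary values chosen for illustration.

```python
# * (repetition) is performed before + (concatenation), as if "a" + ("b" * 3).
print("a" + "b" * 3)      # → abbb

# Parentheses force the concatenation first.
print(("a" + "b") * 3)    # → ababab
```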
Scientific Method
When faced with a new concept, it is good to employ the scientific method. It is an empirical method for acquiring knowledge. We will omit the full scientific method and show only the simplified steps.
- We start with a question (e.g., what is the associativity of exponentiation?).
- Formulate a hypothesis (e.g., left to right or right to left). Your hypothesis should include falsifiable tests.
- Experiment. One of the advantages we have is that we can experiment directly in Python IDLE.
- If the experiment shows a positive result, we may want to summarize our result in a conclusion.
- Otherwise, we go back to step (1) and possibly formulate another hypothesis.
One point to note is that even if the experiment shows a positive result, we may want to probe further. Alternatively, we should come up with several falsifiable tests during the hypothesis step.
Difference from PEMDAS
One major difference from PEMDAS that you may encounter easily is the very high precedence of the exponentiation operator (i.e., `**`).
The exponentiation operator has higher precedence than even the unary minus operator used to indicate a negative number.
As such, you may find it surprising that, for example, `-2 ** 2` evaluates to `-4` rather than `4`, because it is read as `-(2 ** 2)`.
This also highlights that whitespace does not matter when determining the order of operations.
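The following sketch illustrates the point. The value `2` and the different spacings are our own choices for illustration.

```python
# ** is performed before the unary minus, as if written -(2 ** 2).
print(-2 ** 2)      # → -4

# Changing the whitespace does not change the order of operations.
print(-2**2)        # → -4
print(- 2 ** 2)     # → -4

# Parentheses are needed to square the negative number itself.
print((-2) ** 2)    # → 4
```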
Disjunctive Normal Form
PEMDAS works for arithmetic operations and there is no grouping for relational operation2.
The remaining operations are the logical operations for which we will simply state that the order of operation is shown below.
You have to be careful that `or` has the lowest precedence and will be resolved last.
Disjunctive Normal Form
Group | Operation | Symbol | Associativity |
---|---|---|---|
Negation | Negation | `not` | Right to Left |
Conjunction | Conjunction | `and` | Left to Right |
Disjunction | Disjunction | `or` | Left to Right |
The name disjunctive normal form comes from the study of logic. You do not have to worry about the name as long as you remember the order of operation. Unless you have studied logic, the name is rather meaningless.
To be honest, the negation operation does not really conform to the typical associativity as there is only one operand to the negation operation. Still, for completeness, we will show that the order of operations is indeed right to left.
Let us practice the scientific method once more. Our hypothesis now is that the order of operations is as stated in the table above. How can we verify that for conjunction and disjunction?
First, we need an expression that uses both conjunction and disjunction.
So we need three values.
With a little bit of tinkering, you should be able to come up with the expression `True or True and False`.
This expression produces a different result depending on the precedence level of conjunction with respect to disjunction.
- Conjunction before Disjunction: `True or True and False` = `True or (True and False)` = `True or False` = `True`
- Disjunction before Conjunction: `True or True and False` = `(True or True) and False` = `True and False` = `False`
Disjunctive Normal Form
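The experiment can be sketched as follows. The last two lines are an extra test of our own for the precedence of `not` relative to `and`, and are not part of the hypothesis discussed above.

```python
# The expression under test, with no parentheses.
print(True or True and False)      # → True

# The two candidate readings, forced with parentheses.
print(True or (True and False))    # → True  (conjunction first)
print((True or True) and False)    # → False (disjunction first)

# Extra test: not is performed before and, as if (not True) and False.
print(not True and False)          # → False
print(not (True and False))        # → True
```

The unparenthesized expression matches the reading in which conjunction is performed before disjunction.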
Chained Relation
Optional Knowledge
Relational operations can also be chained.
For instance, we may write `1 < 3 < 5`.
The intention is to simplify range checks like the example above, where the middle `3` would be replaced by some variable.
We will learn about variables in later units.
Given the intention above, we can then generalize this into a series of conjunctions.
So `1 < 3 < 5` is equivalent to `1 < 3 and 3 < 5`.
But there is a hidden assumption here that the value `3` is evaluated exactly once.
This may sound weird as it does not seem to change the behavior above, and you are right to say that.
But once we have other constructs, we can see the difference.
As with any good intention, it can be abused.
Checking for range is nice, but consider `1 != 2 != 1`.
At first glance it may seem like we are asking if all three values are different (i.e., \(x \neq y \neq z\) with \(x = 1\), \(y = 2\), and \(z = 1\)).
But that reading is incorrect, as it would produce `False` (the first and last values are both 1).
A simple translation to the conjunction `1 != 2 and 2 != 1` reveals that the result is actually `True`.
That is why this part is optional: the behavior is not as simple as you may initially think, and we have no way of showing that complexity yet. Simply keep in mind that chaining is possible and use it for its intended purpose.
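The two examples discussed in this box can be checked with a short sketch, placing each chained relation next to its translation into a conjunction.

```python
# Intended use: a range check.
print(1 < 3 < 5)            # → True
print(1 < 3 and 3 < 5)      # → True (the equivalent conjunction)

# Abuse: this does NOT check that all three values are different.
print(1 != 2 != 1)          # → True
print(1 != 2 and 2 != 1)    # → True (the first and last values are never compared)
```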
Combined
So far we have shown the order of operations for arithmetic and logical operations separately. We have also shown that parentheses can be used to force an ordering that differs from the convention, since parentheses have the highest precedence. But what if we want to combine arithmetic, relational, and logical operations? What should be the order of operations between these kinds of operations?
Before we show the table of the combined order of operations, let us explore some possible orderings. The aim is to show you that Python makes reading code quite natural.
Consider the code `2 + 3 > 4`.
When faced with such code, what is the most natural reading of it?
Well, there are two possibilities, but notice how one of them does not make much sense, as we do not know how to evaluate it further.
- Arithmetic before Relational: `2 + 3 > 4` = `(2 + 3) > 4` = `5 > 4` = `True`
- Relational before Arithmetic: `2 + 3 > 4` = `2 + (3 > 4)` = `2 + False` = ???
So it makes more sense that all arithmetic operations have higher precedence than all relational operations. More often than not, the natural reading leads to the right conclusion. Unfortunately, things are not always that simple. We highly recommend reading the common mistakes box for a case where this "natural" reading fails to guide us.
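We can also confirm this with an experiment, sketched below. One caveat that goes slightly beyond these notes: Python does accept `2 + (3 > 4)`, because it treats `False` as `0` in arithmetic. That gives the two readings different results, which is exactly what a falsifiable test needs.

```python
# The expression under test, with no parentheses.
print(2 + 3 > 4)      # → True

# The two candidate readings, forced with parentheses.
print((2 + 3) > 4)    # → True (arithmetic first)
print(2 + (3 > 4))    # → 2    (relational first; False is treated as 0)
```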
Continuing this journey, we have placed arithmetic at higher precedence than relational. Now what about logical operations? Here, we will consider the code to check if the number 6 is fully divisible by 2 and 3 as shown below
For the code to be read as such, we need to add the parentheses in the following way
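The original listings are not reproduced here. A plausible way to write the check, assuming it uses the remainder operator `%` together with the equality operator `==`, is sketched below, first without parentheses and then with the parentheses that match the natural reading.

```python
# Assumed form of the check: is 6 fully divisible by 2 and by 3?
print(6 % 2 == 0 and 6 % 3 == 0)            # → True

# The same check with every operation parenthesized:
# arithmetic first, then relational, then logical.
print(((6 % 2) == 0) and ((6 % 3) == 0))    # → True
```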
This means that logical operations have the lowest precedence among the three kinds of operations we have discussed. In fact, you can try other ways to put the parentheses and notice that the meaning is not quite as natural.
Combined Order of Operation
So we can conclude that the general ordering of operations when considering different kinds of operations is as follows:
Kind | Ordering within Group |
---|---|
Arithmetic | PEMDAS |
Relational | not explained yet |
Logical | Disjunctive Normal Form |
Common Mistakes
We will show how the "natural" reading of code may lead to unexpected behavior. Let us consider the following problem:
Check if 6 is greater than 2 or 3
Based on a "natural" reading of the problem, we may accidentally come to the conclusion that the code is as shown below.
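A word-for-word translation of the problem, which is our assumption of what the "natural" code looks like, would be:

```python
# Word-for-word translation of "6 is greater than 2 or 3".
print(6 > 2 or 3)    # → True
```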
Well, that is to be expected.
6 is indeed greater than 2 or 3.
So we expect the output to be `True`.
But wait, even if the test confirms our hypothesis, it may be because our test is not thorough enough.
If we still have time, we should come up with another example.
Check if 6 is greater than 7 or 3
That is similar enough and we expect the result to still be `True` because one of the statements is correct:
6 is greater than 3.
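Using the same word-for-word translation (again an assumed form of the code), the second test looks like this:

```python
# Word-for-word translation of "6 is greater than 7 or 3".
print(6 > 7 or 3)    # → 3
```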
Now that is weird. Why is the result not even a boolean value? What is happening? Could it be that we are missing parentheses? Let us try again with parentheses.
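One plausible placement of the parentheses, which is our guess at the attempt described here, is around `7 or 3`:

```python
# Attempt: group "7 or 3" together before comparing.
print(6 > (7 or 3))    # → False
```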
That does not help.
The result should be `True`.
The problem has multiple dimensions.
First, we need to consider the order of operations as we have learnt.
This gives us the following order: `(6 > 7) or 3`.
Now we can evaluate the inside of the parentheses and we should get `False or 3`.
How do we evaluate that?
Evaluating it relies on the truthy/falsy behavior that was discussed as a bad practice in the previous sub-unit.
Since the LHS is `False`, the result is simply the RHS.
The main lesson is: do not be lazy. Just because an expression looks "natural" does not mean it is correct. When in doubt, always write the expression in full.
In short, only use the shorthand when you are sure what the behavior is. This shorthand may even be considered a bad practice if you are not using it for checking range.
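Writing the check in full means repeating the left-hand side for each comparison. A minimal sketch for the two problems in this box:

```python
# "6 is greater than 2 or 3", written in full.
print(6 > 2 or 6 > 3)    # → True

# "6 is greater than 7 or 3", written in full.
print(6 > 7 or 6 > 3)    # → True
```

Both results are now boolean values, and both match what we expected.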
More on Parentheses
Since parentheses have the highest precedence, they can be used to avoid ambiguity in the order of operations. This is especially useful when we are not sure about the order of operations. Consider the problem of finding the balance \(B\) when we deposit some money \(P\) in a bank with interest rate \(r\), number of times the interest is compounded \(n\), and time \(t\), expressed as the formula below.
\[B = P\left(1 + \frac{r}{n}\right)^{nt}\]
You do not have to worry about the formula, just that it is complex. We will use the following simple values:
- \(P\) = 10000
- \(r\) = 0.05
- \(n\) = 12
- \(t\) = 10
Plugging in the values, we should get 16470.09. First note that we will need parentheses, as the following expression clearly gives the wrong result.
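As a sketch, one way to get it wrong is to type the compound interest formula from left to right with no parentheses at all. This particular expression is our assumption of what such an attempt looks like.

```python
# No parentheses: ** binds tightest, so this is 10000 * 1 + 0.05 / (12 ** 12) * 10.
print(10000 * 1 + 0.05 / 12 ** 12 * 10)    # roughly 10000, far from 16470.09
```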
If we do not know the minimum number of parentheses needed, what can we do? The easiest approach is to write parentheses everywhere. This is especially useful if we are the ones writing the code. So if we can spend the time to write the parentheses (or if our typing speed is fast), then we can force the order of operations to match what we understand from the formula by adding parentheses.
So do not be afraid of losing time by writing more parentheses. It will take longer to figure out why the code without parentheses does not work and to add parentheses compared to simply just writing parentheses from the start.
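A sketch of the "parentheses everywhere" approach for the compound interest formula is shown below. The use of `round` to display two decimal places is our own addition and has not been covered in these notes.

```python
# Every operation is wrapped in its own pair of parentheses.
print(round((10000 * ((1 + (0.05 / 12)) ** (12 * 10))), 2))    # → 16470.09
```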
Minimum Parentheses
In case you are interested, the minimum number of parentheses needed is the following:
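Assuming the compound interest formula with the values given earlier, two pairs of parentheses are enough: one so that the addition happens before the exponentiation, and one so that the exponent `12 * 10` is computed as a whole. As before, `round` is only used here to display two decimal places.

```python
# Only the parentheses that change the default order are kept.
print(round(10000 * (1 + 0.05 / 12) ** (12 * 10), 2))    # → 16470.09
```

No parentheses are needed around `0.05 / 12` because division already has higher precedence than addition.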
Parenthesesizing
If we do not care about the number of parentheses in an expression as long as we can understand the order of operations, we can always add parentheses to the code to help us read it. Considering the PEMDAS convention, we can either
- add parentheses starting from the highest precedence (i.e., higher position on the table), or
- replace an operator \(\oplus\) with \())) \oplus (((\) where the number of \()\) and \((\) depends on the precedence
Let us reproduce the PEMDAS table below.
Group | Operation | Symbol | Associativity | Parenthesesizing |
---|---|---|---|---|
Parentheses | Parentheses | `()` | Leftmost Innermost | - |
Exponentiation | Exponentiation | `**` | Right to Left | `) ** (` |
Multiplication, Division | Multiplication, Division, Floor Division, Remainder | `*` `/` `//` `%` | Left to Right | `)) * ((` `)) / ((` `)) // ((` `)) % ((` |
Addition, Subtraction | Addition, Subtraction | `+` `-` | Left to Right | `))) + (((` `))) - (((` |
You are probably confused about the strategy, especially the second one.
We will illustrate both strategies with `1 + 2 * 3 ** 4 // 5`.
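First strategy (add parentheses starting from the highest precedence):

Expression | Reason |
---|---|
`1 + 2 * 3 ** 4 // 5` | original expression |
`= 1 + 2 * (3 ** 4) // 5` | `**` has the highest precedence |
`= 1 + (2 * (3 ** 4)) // 5` | `*` and `//` have the same precedence, so the left one goes first |
`= 1 + ((2 * (3 ** 4)) // 5)` | `*` and `//` have the same precedence, so the right one goes next |
`= (1 + ((2 * (3 ** 4)) // 5))` | `+` has the lowest precedence |
`= 1 + (2 * (3 ** 4) // 5)` | simplify, if needed |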
Second strategy (replace each operator with its parenthesized form):

Expression | Reason |
---|---|
`1 + 2 * 3 ** 4 // 5` | original expression |
`= 1 + 2 * 3 ) ** ( 4 // 5` | `**` is replaced with `) ** (` |
`= 1 + 2 )) * (( 3 ) ** ( 4 )) // (( 5` | `*` and `//` are replaced with `)) * ((` and `)) // ((` |
`= 1 ))) + ((( 2 )) * (( 3 ) ** ( 4 )) // (( 5` | `+` is replaced with `))) + (((` |
`= ((( 1 ))) + ((( 2 )) * (( 3 ) ** ( 4 )) // (( 5 )))` | balance the outermost parentheses |
`= 1 + (2 * (3 ** 4) // 5)` | simplify, if needed |
Notice how both arrived at the same expression after simplification.
You may choose either strategy to help you evaluate an expression, no matter how complex it is.
But we only show how to do it for PEMDAS.
It should be straightforward how to extend this for combined expressions.
Simply follow the overall order of operations, especially for how many `)))` and `(((` to add.
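As a quick sanity check, we can ask Python to evaluate the original expression together with the parenthesized versions produced by the two strategies. If the parentheses were added correctly, every line should print the same value.

```python
# The original expression, with no parentheses.
print(1 + 2 * 3 ** 4 // 5)                                    # → 33

# Fully parenthesized result of the first strategy.
print((1 + ((2 * (3 ** 4)) // 5)))                            # → 33

# Balanced result of the second strategy, exactly as written.
print(((( 1 ))) + ((( 2 )) * (( 3 ) ** ( 4 )) // (( 5 ))))    # → 33

# The simplified form that both strategies arrive at.
print(1 + (2 * (3 ** 4) // 5))                                # → 33
```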
Summary
Summary of Order of Operations
The precedence is listed from high to low. The higher the value, the higher the precedence. An alternative notation is to use priority, in which the smaller the value, the higher the priority.
Symbol | Numeric | String | Boolean | Precedence |
---|---|---|---|---|
`**` | Exponentiation | - | - | 6 |
`*` | Multiplication | Repetition | - | 5 |
`/` | Division | - | - | 5 |
`//` | Floor Division | - | - | 5 |
`%` | Remainder | - | - | 5 |
`+` | Addition | Concatenation | - | 4 |
`-` | Subtraction | - | - | 4 |
`not` | - | - | Negation | 3 |
`and` | - | - | Conjunction | 2 |
`or` | - | - | Disjunction | 1 |
Note that this is not the complete set of operations in Python.
There are other operations we did not explain, such as the bitwise operations.
You may find more complete documentation online.
Additionally, we omit some unary operations such as `+` and `-` (e.g., `+3` and `-5`), as we consider them part of values.