Values
Learning Objectives
At the end of this sub-unit, students should
- understand the different basic types in Python.
- know how to write values of different basic types.
- know how to check basic types.
- know how to convert basic types.
Types
Python has many useful built-in data types that simplify solving problems. We will introduce them over the course of these notes, but for now we start with the simple built-in types shown below. We will also call them "basic types" [1].
Basic Data Type
Type Name | Python Type | Definition | Examples | Comments |
---|---|---|---|---|
Boolean | bool | Representation of yes/no | True, False | Only has two permissible values |
Integer | int | Representation of whole numbers | -1, 0, 1, 999 | In Python, integers have arbitrary precision |
Floating Point | float | Representation of real numbers | -1.5, 0.1, 0.333, 2.5666 | Floating point values may be represented imprecisely |
String | str | Representation of textual data | "abc", "CS1010S", "10" | A string can be thought of as a sequence of characters |
Additionally, we will also put explanations as comments.
In Python, a comment starts with the hash symbol (i.e., #) and continues until the end of the line.
Comments are not part of the code and will be ignored by the interpreter.
Their job is to provide explanations for programmers who read the code.
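As a minimal sketch of the comment syntax described above (the variable names are purely illustrative and are not taken from these notes):

```python
# This whole line is a comment and is ignored by the interpreter.
course_code = "CS1010S"  # a comment can also follow code on the same line

# A hash symbol inside a string is part of the string, not a comment.
with_hash = "#1"
```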
Lastly, we show some bad practices that you should avoid.
Boolean
Boolean is a representation of yes/no or the values being correct/incorrect, valid/invalid, etc.
As there are only two possible values (e.g., yes/no), there are also only two possible values for the Python bool type: True and False.
As stated above, programming requires precise instructions. This means that the keywords True and False have to be written exactly as shown here. In particular, the first letter has to be written in uppercase.
Common Mistakes
A common mistake when writing boolean values is to use lowercase. The following are incorrect boolean values:
true
false
If you are using IDLE, such mistakes are easier to detect.
As these words are not recognized as keywords, you will typically see no syntax highlighting on them.
In particular, true and false are not colored while True and False are colored.
We try to match the colors with what you see in IDLE.
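The difference between the capitalized keywords and their lowercase look-alikes can be sketched as follows. The try/except construct is used here only so that the example keeps running; it has not been introduced yet and the variable names are illustrative.

```python
yes = True
no = False

try:
    true  # lowercase: Python looks for a name called true and finds none
    lowercase_accepted = True
except NameError:
    lowercase_accepted = False

print(lowercase_accepted)  # False
```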
Integer
Integer is a representation of whole numbers.
The values can be zero, positive, or negative.
Similar to mathematics, negative numbers are prefixed with a minus sign (i.e., -).
Otherwise, the value is treated as a positive number.
Do note that the minus sign is actually an operator, so you need to be careful when using it.
For instance, --1 is the positive integer 1.
Terminology
Term | Meaning |
---|---|
Positive | Greater than 0 |
Zero | Equal to 0 |
Negative | Smaller than 0 |
Unlike in many other programming languages, Python integers have arbitrary precision.
In other words, they can be arbitrarily large or small, limited only by the available memory.
Additionally, to write large numbers, Python allows us to separate the digits with the underscore sign (i.e., _).
There are some limitations that you need to be careful of, so use it responsibly.
We recommend grouping the digits in threes from right to left.
Integer Values
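A few integer values written in the ways described above, as a minimal sketch (the variable names and the particular numbers are illustrative):

```python
negative_one = -1
zero = 0
one_million = 1_000_000   # underscores grouped in threes from the right
double_minus = --1        # the minus operator applied twice
very_large = 123_456_789_012_345_678_901_234_567_890  # no overflow

print(one_million)   # 1000000
print(double_minus)  # 1
```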
Bad Practices
There are some limitations to the way integers can be written in Python. We will provide some erroneous ways integers can be written. For now, you may ignore the error messages; we will go through them in more detail in later sub-units.
The following have no errors but are simply bad practice. They are bad practice because they may make your code more difficult to read. Although code is written for computers to run, it should also be written for other programmers to read.
While uncommon, you can also use the plus symbol (i.e., +) to indicate that the number is supposed to be treated as non-negative.
However, as the default is to treat numbers as non-negative, this is simply redundant.
Hence, we classify this as bad practice as it will also make your code more difficult to read.
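The sketch below illustrates both kinds of problem. The specific literals are assumptions chosen for illustration and are not necessarily the ones used in the original examples. The built-in compile function, the list, and the loop are used only to check each rejected literal without stopping the program.

```python
# Legal but hard to read: prefer 1_000 and 5.
odd_grouping = 1_0_0_0
redundant_plus = +5

# Illustrative literals that Python rejects with a SyntaxError.
rejected = []
for text in ["01", "1__000", "1_000_"]:
    try:
        compile(text, "<example>", "eval")
    except SyntaxError:
        rejected.append(text)

print(odd_grouping, redundant_plus)  # 1000 5
print(rejected)  # ['01', '1__000', '1_000_']
```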
Floating Point
Floating point is a representation of real numbers.
The name comes from the fact that the decimal point (i.e., the dot .) can be in different places (i.e., floating).
To differentiate between integers and floating point numbers, we have to include a dot when writing floating point numbers.
Similar to integers, we can also construct negative floating point numbers by prepending a minus sign. The same warning as before applies here as well.
Unfortunately, some real numbers cannot be represented using a fixed number of digits.
For example, in decimal the number 1/3 is written as 0.33333... with infinitely repeating 3s.
A similar problem occurs in the Python float type.
But because computers represent numbers in a different way [2], the set of numbers that cannot be represented exactly is different.
This means that float is only an approximation of real numbers.
Floating Point Values
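A minimal sketch of floating point values and of the imprecision mentioned above (variable names are illustrative; the addition operator is used here ahead of its formal introduction):

```python
half = 0.5
negative = -1.5
whole = 2.0              # the dot makes this a float, not an int
tenth_plus_fifth = 0.1 + 0.2

print(whole)             # 2.0
print(tenth_plus_fifth)  # 0.30000000000000004
```

The last line shows that 0.1 + 0.2 is only approximately 0.3.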
Alternative Representation
If you have tried to write a very small number like 0.00000000001, you will notice that the value printed back is different: 1e-11.
The value that is printed back is actually a different way to write floating point numbers.
This way of writing floating point numbers is close to scientific notation.
In scientific notation, instead of writing 0.003102, we can write \(3.102 \times 10^{-3}\) and instead of writing 87.43, we write \(8.743 \times 10^{1}\).
You have probably noticed the lack of a \(\times\) symbol on your keyboard.
To replace it, Python uses the letter e, as in 8.743e1.
This shows the mapping: instead of \(x \times 10^{y}\), we can write the floating point number as \(x\)e\(y\).
You do not have to know this alternative representation, but it may help when writing
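The mapping between the two notations can be checked with a short sketch (variable names are illustrative):

```python
small = 0.00000000001
print(small)   # 1e-11

a = 8.743e1    # 8.743 x 10^1
b = 3.102e-3   # 3.102 x 10^-3
print(a)       # 87.43
print(b)       # 0.003102
```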
Bad Practices
Note that a floating point number written in the usual (non-exponent) notation must have a decimal point. However, we may omit the digits before or after the decimal point.
Unfortunately, we cannot omit both.
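As a hedged illustration of the rule above, the following sketch omits the digits on one side of the dot and then checks that a lone dot is rejected. The compile call and try/except are used only to avoid stopping the program.

```python
no_digits_after = 1.    # same value as 1.0
no_digits_before = .5   # same value as 0.5

try:
    compile(".", "<example>", "eval")
    lone_dot_accepted = True
except SyntaxError:
    lone_dot_accepted = False

print(no_digits_after, no_digits_before)  # 1.0 0.5
print(lone_dot_accepted)                  # False
```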
String
String is a representation of textual data.
Text may involve not only numerical symbols (i.e., 0-9) but also letters of the alphabet, which may be uppercase (i.e., A-Z) or lowercase (i.e., a-z) [3].
Moreover, there are other symbols that can be included in textual data (e.g., dot [.], minus [-], plus [+], space [ ], etc.), but the three above are going to be our most common classification of textual symbols.
Terminology
Term | Symbols |
---|---|
Numeric | 0 - 9 |
Uppercase | A - Z |
Lowercase | a - z |
Alphabet | A - Z, a - z |
Alphanumeric | 0 - 9, A - Z, a - z |
To differentiate between strings made purely of numerical symbols and integers, we need to put the symbols that are to be treated as a string within an enclosure called a quotation. The content of the string must be fully enclosed by a matching pair of quotation marks.
In Python, there are multiple ways to enclose a string. We will discuss only four of them here. In the table below, the content of our string will be CS1010S.
Quotation
Quotation | Symbol | Example | Comment |
---|---|---|---|
Single-quote | ' [4] | 'CS1010S' | Can only be single line, cannot have an unescaped ' inside the string |
Double-quote | " | "CS1010S" | Can only be single line, cannot have an unescaped " inside the string |
Triple single-quote | ''' | '''CS1010S''' | Can be multi line, cannot have an unescaped ''' inside the string |
Triple double-quote | """ | """CS1010S""" | Can be multi line, cannot have an unescaped """ inside the string |
Having different ways to write a string is actually an advantage.
This way, we can write a string containing " as long as we enclose it with '.
If we need both, then we still have some other alternatives.
We will describe another way to do this later.
For now, you should familiarize yourself with how a string is written.
Single Line String Values
Note that by default, Python prefers to print strings using the single-quote variant. When it is impossible to do so, Python will then try to print strings using the double-quote variant.
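The following sketch shows single line strings and the quoting preference just described. The built-in repr function is used here as an assumption-free way to obtain the text that the prompt would echo back; the variable names and string contents are illustrative.

```python
course = 'CS1010S'
same_course = "CS1010S"
has_double = 'He said "hi"'   # double-quote inside single-quotes
has_single = "It's fine"      # single-quote inside double-quotes

print(repr(course))      # 'CS1010S'
print(repr(has_single))  # "It's fine"
```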
Multi Line String Values
In multi line strings, you will see another set of symbols at your prompt (i.e., ...) to indicate that the current line is a continuation of the previous line.
Additionally, you will see some special symbols that begin with a backslash (i.e., \).
We will provide explanations for these special symbols later.
For now, just note that the two-character symbol \n is to be treated as a single character representing a newline.
Note that if you are using IDLE, you will not see these escaped characters in a different color. We show them in a different color to illustrate their importance and to make them easier to read.
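A minimal sketch of a multi line string and its single line equivalent (the contents and names are illustrative):

```python
multi_line = '''Line one
Line two'''
single_line = 'Line one\nLine two'
newline = '\n'

print(repr(multi_line))           # 'Line one\nLine two'
print(multi_line == single_line)  # True
print(len(newline))               # 1
```

The last line shows that \n counts as one character even though it is typed as two.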
In the multi line string examples, you are introduced to the special characters involving a backslash. As stated above, each of these two-character symbols is treated as a single character. The reason is that certain characters cannot be shown easily on your screen, especially on a single line.
One of these is the newline. This character can be given as input from your keyboard using the Enter key. To show a newline character on a single line, we use these special characters. We call the backslash the escape character.
There are other characters that can be escaped.
Some of the useful ones are \' (and \"), which include the character ' (respectively, ") inside a single-quoted (respectively, double-quoted) string.
The table below shows the list of common characters that can be escaped.
Escaped Characters
Symbol | Meaning |
---|---|
\n | Newline character (i.e., Enter) |
\t | Tab character (i.e., Tab) |
\' | Single-quote (i.e., ') |
\" | Double-quote (i.e., ") |
\\ | Backslash (i.e., escape the escape character) |
The above list of escaped characters is not exhaustive.
The single-quote and double-quote escapes are important if you have both in one string (e.g., "The Answer is \"Corleone's Corner\"").
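The escaped characters from the table can be tried out with a short sketch (variable names are illustrative, and len is used ahead of its introduction to count characters):

```python
answer = "The Answer is \"Corleone's Corner\""
tabbed = "a\tb"
backslash = "\\"

print(answer)          # The Answer is "Corleone's Corner"
print(len(tabbed))     # 3
print(len(backslash))  # 1
```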
Bad Practices
Remember that a string must be enclosed by matching quotation marks. An unmatched quotation mark will be treated as an error.
- You need to use single-quote quotation mark.
Checks and Conversion
The Python Type column that we introduced earlier in this sub-unit also has a special meaning. It is the name of the data type, the way to recognize the type, as well as a way to convert other types to the given type. Unfortunately, as you will see later, not all types can be converted to one another.
To get the data type of a value, there is a special function that you can use, namely the type function.
For now, we are not going to create our own functions.
We will simply use existing functions.
To use a function, we need to add parentheses (i.e., round brackets () [5]) after the function name.
Inside the parentheses, we can give values to be operated on by the function.
For simplicity, we call these values arguments.
Checking Types
Given that, to check the type of a value, we pass the value as an argument to the type function.
What you need to focus on is the presence of bool, int, float, and str in the output.
Ignore the other parts, namely the <class '...'>.
The concept is called a class and we will have a (potentially optional) sub-unit on that.
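A minimal sketch of checking types with the type function (the values are illustrative and deliberately differ from those in the review questions):

```python
print(type(True))    # <class 'bool'>
print(type(42))      # <class 'int'>
print(type(2.5))     # <class 'float'>
print(type("abc"))   # <class 'str'>

type_of_text = type("42")  # quotation marks make this a str, not an int
print(type_of_text)        # <class 'str'>
```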
Since the names are also a way to convert other types into the given type, we can also treat each name as if it were a function [6].
In other words, we can pass a value \(x\) into bool, int, float, and str to try to convert \(x\) into the respective type (i.e., boolean, integer, floating point, and string respectively).
The basic idea of a conversion is that the value must be a valid value as described in the above section. If the argument is not a valid value of the given type, we will get an error. An exception to this is conversion from floating point to integer, where Python converts the floating point value to an approximate integer value following a rule.
Conversion to Boolean
In short, conversion to a boolean value checks if the value is empty or not.
By empty, we look at the given type and consider the equivalent of what we mean by empty.
For instance, the empty numbers are 0 and 0.0.
An empty string is simply '' or "".
Obviously, converting a boolean to boolean will simply give us the value back.
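The "empty or not" rule can be sketched as follows (variable names are illustrative):

```python
empty_int = bool(0)         # False
empty_float = bool(0.0)     # False
empty_text = bool("")       # False

nonzero_int = bool(-5)      # True
nonzero_float = bool(0.1)   # True
space_text = bool(" ")      # True: a space is still a character

print(empty_int, empty_float, empty_text)       # False False False
print(nonzero_int, nonzero_float, space_text)   # True True True
```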
Conversion to Integer
For a floating point number, conversion to integer always removes the digits after the decimal point.
For instance, 1.99 becomes 1 instead of the 2 that the usual rounding would give.
Similarly, -2.99 is truncated to become -2.
As for a string, it has to be a valid textual representation of an integer as described in the previous section.
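A minimal sketch of these conversions, including a string that is not a valid integer (try/except is used only so the example keeps running; names are illustrative):

```python
truncated_positive = int(1.99)    # 1
truncated_negative = int(-2.99)   # -2
from_text = int("10")             # 10

try:
    int("1.99")  # text of a float, not of an integer
    float_text_accepted = True
except ValueError:
    float_text_accepted = False

print(truncated_positive, truncated_negative, from_text)  # 1 -2 10
print(float_text_accepted)                                # False
```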
Conversion to Floating Point
- Unfortunately, our online REPL does not show the .0 at the end.
Conversion from integer to floating point is the simplest.
Since an integer can be turned into a real number by adding .0 at the end, that is simply what Python will do.
There is, however, an unfortunate difference between our online REPL and IDLE.
In IDLE, you will see the .0 printed, but in our REPL you will not.
Always follow the output from IDLE.
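A minimal sketch of conversion to floating point, with the output as shown by standard Python (names are illustrative; try/except is used only so the example keeps running):

```python
from_int = float(1)          # 1.0
from_text = float("2.5")     # 2.5
from_bool = float(True)      # 1.0

try:
    float("abc")  # not a textual representation of a number
    word_accepted = True
except ValueError:
    word_accepted = False

print(from_int)       # 1.0
print(word_accepted)  # False
```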
Conversion to String
This is the easiest conversion.
As a string may contain alphanumeric characters, converting from another type simply encloses the value within quotation marks.
The usual quotation mark is the single-quote (i.e., ').
Note that for conversion from integer, we are not going to convert it to floating point first, but will simply enclose it within quotation marks.
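A minimal sketch of conversion to string (names are illustrative; repr is used to show the quotation marks that the prompt would display):

```python
from_int = str(10)        # '10', not '10.0'
from_float = str(10.0)    # '10.0'
from_bool = str(True)     # 'True'

print(repr(from_int))     # '10'
print(repr(from_float))   # '10.0'
print(repr(from_bool))    # 'True'
```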
Boolean and Integer
Notice how the conversion from integer to boolean collapses many integers into a single boolean value. This is because there are only 2 possible boolean values, so there are not enough of them for every integer to map to a distinct one. On the other hand, when converting from boolean to integer, we only need to use 2 integer values because there are only 2 boolean values. This conversion can be captured by the following mapping between the two sets.
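In code, the mapping between the two sets can be sketched as follows (variable names are illustrative):

```python
# Boolean to integer: exactly two results.
true_as_int = int(True)     # 1
false_as_int = int(False)   # 0

# Integer to boolean: only 0 maps to False, everything else to True.
zero_as_bool = bool(0)          # False
one_as_bool = bool(1)           # True
two_as_bool = bool(2)           # True
negative_as_bool = bool(-1)     # True

print(true_as_int, false_as_int)  # 1 0
```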
Rounding
There are different rounding methods. We will discuss some that might be relevant. At the very least, we have seen one way of rounding above called truncation. However, this is simply one of many possible rounding possibilities. Other rounding methods are explained below.
Rounding | Description |
---|---|
Round Down | Find the largest integer that is smaller than or equal to the given value. This is also known as floor. |
Round Up | Find the smallest integer that is larger than or equal to the given value. This is also known as ceiling. |
Round to Zero | Remove the digits after the decimal point. This is also known as truncation. |
We can visualize the different rounding methods using the diagram below.
- Green Line: The green line at the top is the floor operation.
- Blue Line: The blue line in the middle is the truncation operation.
- Red Line: The red line at the bottom is the ceiling operation.
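The three rounding methods can be compared with a short sketch. It assumes the standard math module, which these notes have not introduced yet; the variable names are illustrative.

```python
import math

value = -2.5
floor_result = math.floor(value)   # -3: round down
ceil_result = math.ceil(value)     # -2: round up
trunc_result = math.trunc(value)   # -2: round to zero
int_result = int(value)            # -2: int() also truncates

print(floor_result, ceil_result, trunc_result, int_result)  # -3 -2 -2 -2
print(math.floor(2.5), math.ceil(2.5), math.trunc(2.5))     # 2 3 2
```

For negative values, truncation agrees with ceiling; for positive values, it agrees with floor.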
Bad Practices
Remember how we said that computers do not understand our intention? In the case of type conversion, Python does not understand English (or other human languages). Additionally, it does not evaluate an expression that is written as a string.
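Both failures can be sketched as follows. The two strings are illustrative assumptions, and the list, loop, and try/except are used only to collect the failures without stopping the program.

```python
failed = []
for text in ["one", "1 + 1"]:
    try:
        int(text)  # neither an English word nor an expression is accepted
    except ValueError:
        failed.append(text)

print(failed)  # ['one', '1 + 1']
```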
Another common problem is that when we chain conversions, we may get counter-intuitive (but explainable) results. Note first that we can chain function calls by giving the result of one function as an argument to another function. We will discuss chaining further in future sub-units.
For instance, we can write int(bool(0)) to get 0.
The examples below show counter-intuitive results that you should try to avoid.
In fact, for readability, do not chain too many functions if possible.
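A few chained conversions, as a hedged sketch of the kind of surprise meant here (these particular chains are illustrative and may differ from the original examples):

```python
zero_round_trip = int(bool(0))      # 0
text_false = bool(str(False))       # True: 'False' is a non-empty string
text_zero = bool(str(0))            # True: '0' is a non-empty string
lost_fraction = float(int(1.99))    # 1.0: the .99 is gone for good

print(zero_round_trip, text_false, text_zero, lost_fraction)  # 0 True True 1.0
```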
Arbitrary Precision Integer
Recall that integers have arbitrary precision but floating point is merely an approximation of real numbers. This means that floating point has finite precision and a finite range, so there must be integers that are larger than any floating point value.
The smallest such value is rather large and, for most practical uses, you should not encounter it. If you are wondering, the value is shown below assuming we are only using the digit 9. However, you should put it on a single line.
If you try to convert this into a float, you will get an error.
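The error can be reproduced with a sketch like the one below. The 400-digit integer is an assumption chosen to be comfortably too large; it is not the exact threshold value referred to above. String repetition with * and try/except are used ahead of their introduction.

```python
too_big = int("9" * 400)  # an integer with 400 digits, all of them 9

try:
    float(too_big)
    converted = True
except OverflowError:
    converted = False

print(converted)  # False
```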
Review
Question 1
What is the type of the value 1010?
Question 2
What is the type of the value 99.0?
Question 3
What is the type of the value "1010"?
Question 4
What is the type of the value 1010S?
Question 5
Select ALL Python types.
Question 6
Select ALL the quotation marks from this section.
1. Some authors may call them primitive types instead.
2. In computers, we represent numbers in base 2 instead of base 10. This means that instead of 0-9, we only have two symbols, 0 and 1. You do not have to know the details of this; just remember that floating point is only an approximation of real numbers.
3. Hopefully, the way to read these "ranges" is natural. In the case of 0-9, it simply means one of 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. For A-Z and a-z, we have to look only at "consecutive" characters. The convention is that A and a are in different groups.
4. Not to be confused with the backtick (i.e., `), which is the symbol you can typically find to the left of the 1 key on your keyboard.
5. There are different kinds of brackets, so we have to be more precise here. The other kinds of brackets are square brackets (i.e., []) and curly brackets (i.e., braces {}).
6. They are not exactly functions; they are class constructors.