Searching
Learning Objectives
At the end of this sub-unit, students should
- understand the idea of linear searching.
- understand the idea of binary search.
Linear Search
Consider an arbitrary list.
How can we know if it contains a certain element?
Well, luckily, in Python, there is an operator for that called the in
operator.
So that is straightforward. What if we want to find the positive index of the element? And if the element does not exist, we say that the index is -1. Since this is part of the negative index, we have not violated the problem specification.
So how can we do this? If there is no assumption about the elements in the list, then there is only thing we can do. We must search all the possible index.
Linear Search
Can we improve on this? Unfortunately not. We can do a little thought experiment. Say we can improve on this by never searching one of the index. It can be any index. What can be a problem?
The problem is if the value is in that index we skipped. So we cannot skip this particular index. Since we made no assumption about which index we skip, it means that we cannot skip any index at all. Since we cannot skip any index, we have to check all index.
This is an example of informal argument we need to do to convince ourselves that our code is correct.
Binary Search
The reason we cannot improve is because we have no other assumption about the list. In the worst-case, the value is always in the last place we look.
Quote
"I lost some time once. It's always in the last place you look for it."
Neil Gaiman
So to improve linear search, we need to make assumptions on the elements in the list. The assumption we want to impose on the list is that the list is sorted. For simplicity, we will say that the list is in ascending order. This implies that there is no two elements are equal. So let us imagine that we are a computer and we try to search for a particular value 65 inside 7 boxes.
Linear search will open all boxes from left to right or in any other order. To improve on this, we assume that the boxes are arranged in ascending order. We do not want to open all boxes. What can we do?
- Open the middle box.
- This middle box is smaller than 65.
- We can exclude all boxes to the left of this since we know that they must be smaller than 65.
- We open the middle of the remaining box.
- This middle box is larger than 65.
- We can exclude all boxes to the right of this since we know that they must be larger than 65.
- We open the middle of the remaining box.
So we managed to find the box in 3 steps. What happen if 65 is not even in the box? If 65 is not in the box, we do not have any more boxes to consider.
Let us generalize this concept. If we consider all possible values in a straight line, the idea of a binary search is as follows.
First, we let our left hand point at the smallest possible index and our right hand point at the largest possible index. We then check the middle value. For simplicity, use floor division to compute the middle value.
If the middle value is smaller than what we want, the value we want must be between middle and right hand. So we move the left hand to the middle.
Then we open the middle value again.
If the middle value is larger than what we want, the value we want must be between left hand and middle. So we move the right hand to the middle.
Then we open the middle value again.
We repeat this until we found the value.
If our hand crossed, then we know that the value is not in there.
The code is shown below.
Binary Search