Binary Search in Python | Connor Johnson

I’m reading Data Structures and Algorithms in Python, by Goodrich, Tamassia, and Goldwasser, and I think that it is very, very good. I particularly like the explanation of why binary search over a sorted list is $O(\lg(n))$ .

Basically, whenever you repeatedly cut your search space in half, then you produce $O(\lg(n))$ performance. This is because the number of times you need to halve $n$ before you get something less than or equal to 1 is $b=\lg(n)$ . Think of it this way, if we solve $\dfrac{n}{2^{b}} \leq 1$ for the smallest $b$ , then we’ve found the number of times we need to halve $n$ . Then we multiply both sides by $2^{b}$ and we get $n \leq 2^{b}$ . Then we take the log base 2 of both sides and we obtain $\lg(n) \leq b$ . Think of the $\leq$ as saying, “yeah, $\dfrac{n}{2^{B}}$ for $B \gg b$ is also less than or equal to one, but that’s not interesting”.

The motivation for binary search over sorted data versus a sequential search over unsorted data is that with the former option we obtain $O(\lg(n))$ time complexity, whereas with the later we are stuck with $O(n)$ time complexity.

The binary search algorithm is singly recursive, and in particular, tail call recursive, since it returns itself rather than some function of itself. By contrast, a recursive factorial function would return something like n*factorial(n-1), and a recursive Fibonacci function would return something like fib(n-1)+fib(n-2), which are a functions of themselves. Although all recursive functions can be implemented iteratively, tail call recursive functions are particularly easy to convert into an iterative form.

x = [ 2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37 ]

def binarySearch( data, target, low=None, high=None ):
    if low == None and high == None:
        low, high = 0, len(data)-1
    if low > high:
        return -1
    else:
        midpoint = (high+low)/2
        if target == data[midpoint]:
            return midpoint
        elif target < data[midpoint]:
            return binarySearch( data, target, low, midpoint-1 )
        elif target > data[midpoint]:
            return binarySearch( data, target, midpoint+1, high )

# out of lower bound
assert binarySearch( x, 0 ) == -1
# lower limit
assert binarySearch( x, 2 ) == 0
# non-extant target
assert binarySearch( x, 3 ) == -1
# extant target
assert binarySearch( x, 5 ) == 2
# upper limit
assert binarySearch( x, 37 ) == 15
# out of upper bound
assert binarySearch( x, 40 ) == -1

Blog about math, programming, and data.