Binary Search in Python

I’m reading Data Structures and Algorithms in Python, by Goodrich, Tamassia, and Goldwasser, and I think that it is very, very good. I particularly like the explanation of why binary search over a sorted list is O(\lg(n)).

Basically, whenever you repeatedly cut your search space in half, then you produce O(\lg(n)) performance. This is because the number of times you need to halve n before you get something less than or equal to 1 is b=\lg(n). Think of it this way, if we solve \dfrac{n}{2^{b}} \leq 1 for the smallest b, then we’ve found the number of times we need to halve n. Then we multiply both sides by 2^{b} and we get n \leq 2^{b}. Then we take the log base 2 of both sides and we obtain \lg(n) \leq b. Think of the \leq as saying, “yeah, \dfrac{n}{2^{B}} for B \gg b is also less than or equal to one, but that’s not interesting”.

The motivation for binary search over sorted data versus a sequential search over unsorted data is that with the former option we obtain O(\lg(n)) time complexity, whereas with the later we are stuck with O(n) time complexity.

The binary search algorithm is singly recursive, and in particular, tail call recursive, since it returns itself rather than some function of itself. By contrast, a recursive factorial function would return something like n*factorial(n-1), and a recursive Fibonacci function would return something like fib(n-1)+fib(n-2), which are a functions of themselves. Although all recursive functions can be implemented iteratively, tail call recursive functions are particularly easy to convert into an iterative form.

x = [ 2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37 ]

def binarySearch( data, target, low=None, high=None ):
    if low == None and high == None:
        low, high = 0, len(data)-1
    if low > high:
        return -1
        midpoint = (high+low)/2
        if target == data[midpoint]:
            return midpoint
        elif target < data[midpoint]:
            return binarySearch( data, target, low, midpoint-1 )
        elif target > data[midpoint]:
            return binarySearch( data, target, midpoint+1, high )

# out of lower bound
assert binarySearch( x, 0 ) == -1
# lower limit
assert binarySearch( x, 2 ) == 0
# non-extant target
assert binarySearch( x, 3 ) == -1
# extant target
assert binarySearch( x, 5 ) == 2
# upper limit
assert binarySearch( x, 37 ) == 15
# out of upper bound
assert binarySearch( x, 40 ) == -1