Sorting Algorithms in Python

Tonight I’m looking at some sorting algorithms in Python. First up are Bubble Sort and Selection sort, both with O(n^{2}) average and worst case runtime, and O(1) memory. Then I’ll look at an iterative and recursive implementation of Merge Sort, Quick Sort, and Heap Sort with O(n \log n).

Bubble Sort

This algorithm walks through a list several times, swapping consecutive out-of-order elements as it sees them. The worst case is when the list is in descending order when you want it ascending. The outer for-loop might look unnecessary, but go ahead, take it out and see what happens. You’ll see why this one is O(n^{2}).

def bubbleSort(x):
    n = len(x)
    if n <= 1:
        return x
    for _ in range(n):
        for i in range(1,n):
            # if the prev element > next element
            if x[i-1] > x[i]:
                # make the swap
                x[i-1], x[i] = x[i], x[i-1]
    return x

assert bubbleSort([5, 4, 3, 2, 1]) == [1, 2, 3, 4, 5]

Selection Sort

While Bubble Sort is unexpectedly expensive with its outer for-loop, Selection Sort is unexpectedly complex compared to its description. Selection Sort works by finding the smallest element, and swapping it out for the first element. Then it finds the smallest element starting from the second element to the end, and then it swaps that smallest element for the second element, and so on. Here’s what I worked out. (The variable name cursor was pretty random, but I couldn’t deal with another single-letter-variable.)

def selectionSort(x):
    n = len(x)
    if n <= 1:
        return x
    for i in range(n):
        cursor = i
        for j in range(i,n):
            # find the smallest element
            if x[j] < x[cursor]:
                cursor = j
        # make the swap
        x[i], x[cursor] = x[cursor], x[i]
    return x

assert selectionSort([5, 4, 3, 2, 1]) == [1, 2, 3, 4, 5]

Merge Sort

This one was tricky. This book I’m reading says that, “Okay, so first you split everything up and then you just merge things back together and sort them as you go and, like, voila.” The trick is that you do two recursive calls in the same function. I’d never seen that before. It felt like seeing someone castle for the first time. Anyway, I did an iterative version also, because Python by design doesn’t like recursion.

def merge(a, b):
    x = []
    while len(a) > 0 and len(b) > 0:
        if a[0] < b[0]:
            x.append(a.pop(0))
        else:
            x.append(b.pop(0))
    while len(a) > 0:
        x.append(a.pop(0))
    while len(b) > 0:
        x.append(b.pop(0))
    return x

def halve(x):
    half = len(x) / 2
    return x[:half], x[half:]

def recursiveMergeSort(x):
    if len(x) == 1:
        return x
    a, b = halve(x)
    a = recursiveMergeSort(a)
    b = recursiveMergeSort(b)
    return merge(a, b)
        
def iterativeMergeSort(x):
    queue = [[i] for i in x]
    while len( queue ) > 1:
        a, b = queue.pop(0), queue.pop(0)
        queue.append(merge(a, b))
    return queue[0]

x = [40, 82, 97, 47, 7]
assert recursiveMergeSort(x) == [7, 40, 47, 82, 97]
assert iterativeMergeSort(x) == [7, 40, 47, 82, 97]

Quick Sort

import random

def pivot(x):
    return random.randint(0, len(x) - 1)

def partition(x):
    if len(x) == 2:
        lt = [min(x)]
        gt = [max(x)]
    else:
        pvt = pivot( x )
        lt = [i for i in x if i <= pvt]
        gt = [i for i in x if i > pvt]
    return lt, gt

def quicksort(x):
    if len(x) < 2:
        return x
    lt, gt = partition(x)
    lt = quicksort(lt)
    gt = quicksort(gt)
    return lt+gt

assert quicksort([0]) == [0]
assert quicksort([0, 0]) == [0, 0]
assert quicksort([0, 1]) == [0, 1]
assert quicksort([1, 0]) == [0, 1]

Heap Sort

Heap sort enjoys O(n \log n) for best, average, and worst cases.

from typing import *
from dataclasses import dataclass, field

@dataclass
class Heap:

    A: List[int]

    def swap(self, i, j):
        print('\nSwap indices {} and {}'.format(i, j))
        self.A[i], self.A[j] = self.A[j], self.A[i]
        print('{}'.format(self.A))

    def sort(self):
        print('Sorting: {}'.format(self.A))
        self.build_heap()
        print('\n' + '-' * 40)
        N = len(self.A)
        for i in range(N - 1, -1, -1):
            print('\nIndex: {}'.format(i))
            self.swap(0, i)
            self.heapify(0, i)
        return self.A

    def build_heap(self):
        N = len(self.A)
        for i in range(N // 2 - 1, -1, -1):
            self.heapify(i, N)

    def heapify(self, idx, maximum):
        print('heapify({}, {})'.format(idx, maximum))
        largest = idx
        left = 2 * idx + 1
        right = 2 * idx + 2

        if left < maximum and self.A[left] > self.A[idx]:
            largest = left
        if right < maximum and self.A[right] > self.A[largest]:
            largest = right
        if largest != idx:
            self.swap(idx, largest)
            self.heapify(largest, maximum)

assert Heap([8, 11, 9, 2, 10, 16]).sort() == [2, 8, 9, 10, 11, 16]