Sorting Algorithms
1. Introduction
The Sorting Problem: Given an array $A[0..n-1]$ of $n$ elements, rearrange the elements into non-decreasing order: $A[0] \le A[1] \le \dots \le A[n-1]$.
Stability: A sort is stable if elements with equal keys maintain their relative order from the input. Stability matters when sorting by multiple keys (e.g., sort by surname, then by first name).
In-place: A sort is in-place if it uses only $O(1)$ extra memory (excluding the input array).
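Python's built-in `sorted` (Timsort) is stable, so the multi-key idiom from the stability definition can be sketched directly. The sample records below are illustrative, not from the text:

```python
# Stability demo: Python's sorted() is stable (Timsort).
people = [("Smith", "Bob"), ("Jones", "Ann"), ("Smith", "Al")]

# Sort by first name, then by surname. Because the second sort is stable,
# ties on surname keep the first-name order produced by the first sort.
by_first = sorted(people, key=lambda p: p[1])
by_both = sorted(by_first, key=lambda p: p[0])
print(by_both)  # → [('Jones', 'Ann'), ('Smith', 'Al'), ('Smith', 'Bob')]
```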
2. Bubble Sort
Algorithm
Repeatedly step through the array, comparing adjacent pairs and swapping if they are in the wrong order. After each pass, the largest unsorted element "bubbles up" to its correct position.
```python
def bubble_sort(A):
    n = len(A)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                swapped = True
        if not swapped:
            break
    return A
```
Correctness Proof
Theorem. After the $k$-th pass ($1$-indexed), the largest $k$ elements are in their final positions at the end of the array.
Proof. By induction on $k$.
Base case ($k = 1$). The inner loop compares each adjacent pair from index $0$ to $n - 2$. Whenever $A[j] > A[j+1]$, they are swapped. This ensures the maximum element moves rightward through every comparison until it reaches index $n - 1$. ✓
Inductive step. Assume after pass $k$, the largest $k$ elements are at indices $n-k, \dots, n-1$. Pass $k+1$ operates on indices $0$ to $n-k-1$. By the same argument, the maximum element in this range moves to index $n-k-1$. The largest $k+1$ elements are now at indices $n-k-1, \dots, n-1$. ✓
After $n-1$ passes, all elements are sorted.
Complexity
- Worst case: $O(n^2)$ — when the array is in reverse order
- Best case: $O(n)$ — when the array is already sorted (early termination via the swapped flag)
- Average case: $O(n^2)$
- Space: $O(1)$ — in-place
- Stable: Yes

Proof of worst case. In reverse order, pass $i$ performs $n - i$ comparisons and swaps. Total comparisons: $\sum_{i=1}^{n-1} (n - i) = \frac{n(n-1)}{2} = O(n^2)$.
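This count can be checked empirically with an instrumented copy of the code above (the swap counter is added here for illustration):

```python
def bubble_sort_swaps(A):
    """Bubble sort instrumented to count swaps. On reverse-sorted input
    every pair is an inversion, so swaps == n*(n-1)//2."""
    n = len(A)
    swaps = 0
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                swaps += 1
                swapped = True
        if not swapped:
            break
    return swaps
```

For $n = 8$ in reverse order this performs $8 \cdot 7 / 2 = 28$ swaps, and $0$ swaps on sorted input.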
3. Insertion Sort
Algorithm
Build the sorted array one element at a time by inserting each element into its correct position among the previously sorted elements.
```python
def insertion_sort(A):
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key
    return A
```
Correctness Proof
Theorem. After the $i$-th iteration of the outer loop ($i = 1, \dots, n-1$), the subarray $A[0..i]$ is sorted.
Proof. By induction on $i$.
Base case ($i = 1$). $A[0..1]$ contains at most 2 elements. If $A[0] > A[1]$, the larger element is shifted right and key takes its place; otherwise, no change. Either way, $A[0..1]$ is sorted. ✓
Inductive step. Assume $A[0..i-1]$ is sorted. We insert $A[i]$ (stored as key) by shifting elements greater than key one position right. Since $A[0..i-1]$ is sorted, all elements greater than key form a contiguous suffix. After shifting, key is placed at the first position where the element to its left is $\le$ key. The resulting $A[0..i]$ is sorted. ✓
Complexity
- Worst case: $O(n^2)$ — reverse order
- Best case: $O(n)$ — already sorted (inner loop never executes)
- Average case: $O(n^2)$
- Space: $O(1)$ — in-place
- Stable: Yes

Proof of average case. On average, each insertion shifts approximately half of the sorted portion: $\sum_{i=1}^{n-1} \frac{i}{2} = \frac{n(n-1)}{4} = O(n^2)$.
4. Merge Sort
Algorithm
Divide the array into two halves, recursively sort each half, then merge the two sorted halves.
```python
def merge_sort(A):
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    left = merge_sort(A[:mid])
    right = merge_sort(A[mid:])
    return merge(left, right)

def merge(L, R):
    result = []
    i = j = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            result.append(L[i])
            i += 1
        else:
            result.append(R[j])
            j += 1
    result.extend(L[i:])
    result.extend(R[j:])
    return result
```
Correctness Proof
Theorem. merge_sort(A) returns a sorted permutation of $A$.
Proof. By strong induction on the array size $n$.
Base case. $n \le 1$: the array is trivially sorted. ✓
Inductive step. Assume merge_sort correctly sorts all arrays of size less than $n$. For an array of size $n \ge 2$:
- Split into $L$ (size $\lfloor n/2 \rfloor$) and $R$ (size $\lceil n/2 \rceil$). By the inductive hypothesis, merge_sort(L) and merge_sort(R) return sorted arrays.
- merge combines them: at each step, it appends the smaller of the two front elements. This produces a sorted array (standard merge of two sorted sequences). merge then appends all remaining elements, so no elements are lost.

The result is a sorted permutation of the input. ✓
Complexity
Theorem. Merge sort runs in $O(n \log n)$ time.
Proof. The recurrence relation is:
$$T(n) = 2T(n/2) + O(n), \qquad T(1) = O(1)$$
The $O(n)$ term comes from the merge step, which processes each element exactly once.
By the Master Theorem: $a = 2$, $b = 2$, $f(n) = O(n)$. We have $n^{\log_b a} = n^{\log_2 2} = n = \Theta(f(n))$, which is case 2. Therefore:
$$T(n) = \Theta(n \log n)$$
Space: $O(n)$ — the merge step requires a temporary array. Stable: Yes (merge uses `<=`).
5. Quick Sort
Algorithm
Select a pivot element, partition the array so that elements less than the pivot are on the left and elements greater than or equal to the pivot are on the right, then recursively sort the two partitions.
```python
def quick_sort(A, low=0, high=None):
    if high is None:
        high = len(A) - 1
    if low < high:
        pivot_index = partition(A, low, high)
        quick_sort(A, low, pivot_index - 1)
        quick_sort(A, pivot_index + 1, high)

def partition(A, low, high):
    pivot = A[high]
    i = low - 1
    for j in range(low, high):
        if A[j] < pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[high] = A[high], A[i + 1]
    return i + 1
```
Correctness Proof
Theorem. After partition(A, low, high), the pivot is at its final sorted position, all elements to its left are $<$ pivot, and all elements to its right are $\ge$ pivot.
Proof. The variable i tracks the boundary between elements $<$ pivot (indices low..i) and elements $\ge$ pivot (indices i+1..j-1). The loop invariant:
At the start of each iteration with index j:
- A[low..i] contains only elements $<$ pivot
- A[i+1..j-1] contains only elements $\ge$ pivot
- A[high] = pivot (unchanged)

Maintenance. If $A[j] <$ pivot, increment i and swap $A[i]$ with $A[j]$, extending the "$<$ pivot" region. If $A[j] \ge$ pivot, increment j only, extending the "$\ge$ pivot" region.
Termination. After the loop, swap $A[i+1]$ with $A[\text{high}]$ (the pivot). Now:
- A[low..i]: all $<$ pivot
- A[i+1] = pivot (final position)
- A[i+2..high]: all $\ge$ pivot
Complexity
| Case | Time | Pivot choice |
|---|---|---|
| Best | $O(n \log n)$ | Median |
| Average | $O(n \log n)$ | Random |
| Worst | $O(n^2)$ | Min/max (e.g., sorted input) |

Space: $O(\log n)$ average (recursion stack), $O(n)$ worst. Stable: No (partitioning swaps can change relative order).
Proof of worst case. If the pivot is always the smallest or largest element, one partition has size 0 and the other has size $n - 1$:
$$T(n) = T(n-1) + O(n) \implies T(n) = O(n^2)$$
Proof of average case. With random pivot selection, the expected partition sizes are roughly $n/2$ each. The recurrence is approximately $T(n) = 2T(n/2) + O(n)$ on average, giving $O(n \log n)$ by the Master Theorem.
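Random pivot selection can be retrofitted onto the Lomuto scheme by swapping a random element into position `high` before partitioning. A minimal sketch (partition reproduced so the snippet runs standalone):

```python
import random

def partition(A, low, high):
    """Lomuto partition with A[high] as the pivot (as above)."""
    pivot = A[high]
    i = low - 1
    for j in range(low, high):
        if A[j] < pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[high] = A[high], A[i + 1]
    return i + 1

def randomized_quick_sort(A, low=0, high=None):
    """Quick sort with a uniformly random pivot, swapped into A[high]."""
    if high is None:
        high = len(A) - 1
    if low < high:
        r = random.randint(low, high)   # random pivot position
        A[r], A[high] = A[high], A[r]   # move pivot to the end
        pivot_index = partition(A, low, high)
        randomized_quick_sort(A, low, pivot_index - 1)
        randomized_quick_sort(A, pivot_index + 1, high)
```

With this change, no fixed input (sorted or otherwise) deterministically triggers the $O(n^2)$ case.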
6. Comparison Table
| Algorithm | Best | Average | Worst | Space | Stable |
|---|---|---|---|---|---|
| Bubble Sort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | Yes |
| Insertion Sort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | Yes |
| Merge Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(n)$ | Yes |
| Quick Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n^2)$ | $O(\log n)$ | No |
| Heap Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(1)$ | No |
7. The Lower Bound for Comparison-Based Sorting
Theorem. Any comparison-based sorting algorithm requires $\Omega(n \log n)$ comparisons in the worst case.
Proof. A comparison-based sorting algorithm can be modelled as a decision tree. Each internal node represents a comparison (e.g., "$A[i] < A[j]$?"), and each leaf represents a permutation of the input (a possible output).
There are $n!$ possible permutations of $n$ elements, so the decision tree has at least $n!$ leaves.
A binary tree of height $h$ has at most $2^h$ leaves, so:
$$2^h \ge n! \implies h \ge \log_2 n!$$
Using Stirling's approximation:
$$\log_2 n! = n \log_2 n - n \log_2 e + O(\log n) = \Omega(n \log n)$$
Therefore, any comparison-based sorting algorithm requires at least $\Omega(n \log n)$ comparisons in the worst case.
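For small $n$ the bound $\lceil \log_2 n! \rceil$ can be evaluated directly (the function name here is ours, chosen for illustration):

```python
import math

def min_comparisons(n):
    """Information-theoretic lower bound: ceil(log2(n!)) comparisons
    are needed in the worst case to sort n distinct elements."""
    return math.ceil(math.log2(math.factorial(n)))

# e.g. min_comparisons(5) == 7: a decision tree of height 6 has only
# 2**6 = 64 leaves, fewer than 5! = 120 permutations.
```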
Merge sort and heap sort are therefore asymptotically optimal among comparison-based sorts. Non-comparison sorts (radix sort, counting sort) can beat $O(n \log n)$ but have restrictions on key types.
Problem Set
Problem 1. Trace bubble sort on the array [5, 1, 4, 2, 8]. Show the array after each pass.
Answer
Pass 1: Compare and swap adjacent pairs.
[5, 1, 4, 2, 8] → [1, 5, 4, 2, 8] → [1, 4, 5, 2, 8] → [1, 4, 2, 5, 8] → [1, 4, 2, 5, 8]
After pass 1: [1, 4, 2, 5, 8]
Pass 2: [1, 4, 2, 5, 8] → [1, 2, 4, 5, 8] → [1, 2, 4, 5, 8] → [1, 2, 4, 5, 8]
After pass 2: [1, 2, 4, 5, 8]
Pass 3: No swaps → sorted. Early termination.
Problem 2. Trace insertion sort on the array [5, 1, 4, 2, 8]. Show the array after each
insertion.
Answer
| i | key | Array state |
|---|---|---|
| 1 | 1 | [1, 5, 4, 2, 8] |
| 2 | 4 | [1, 4, 5, 2, 8] |
| 3 | 2 | [1, 2, 4, 5, 8] |
| 4 | 8 | [1, 2, 4, 5, 8] (no shift) |
Sorted: [1, 2, 4, 5, 8]
Problem 3. Show the merge process when merging [1, 3, 5] and [2, 4, 6, 8].
Answer
| Step | L remaining | R remaining | Output |
|---|---|---|---|
| 1 | [1, 3, 5] | [2, 4, 6, 8] | 1 < 2 → output 1 |
| 2 | [3, 5] | [2, 4, 6, 8] | 3 > 2 → output 2 |
| 3 | [3, 5] | [4, 6, 8] | 3 < 4 → output 3 |
| 4 | [5] | [4, 6, 8] | 5 > 4 → output 4 |
| 5 | [5] | [6, 8] | 5 < 6 → output 5 |
| 6 | [] | [6, 8] | append [6, 8] |
Result: [1, 2, 3, 4, 5, 6, 8]
Problem 4. Trace quick sort on [3, 6, 8, 10, 1, 2, 1] using the last element as pivot. Show
the array and pivot after each partition.
Answer
Call 1: quick_sort(A, 0, 6) on [3, 6, 8, 10, 1, 2, 1]. Pivot = 1 (index 6). No element is strictly less than 1, so the final swap moves the pivot to index 0: [1, 6, 8, 10, 1, 2, 3]. Pivot index = 0. Recurse on (1, 6).
Call 2: quick_sort(A, 1, 6) on [6, 8, 10, 1, 2, 3]. Pivot = 3 (index 6). Elements 1 and 2 are swapped left, giving [1, 1, 2, 10, 6, 8, 3]; the final swap places the pivot: [1, 1, 2, 3, 6, 8, 10]. Pivot index = 3. Recurse on (1, 2) and (4, 6).
Call 3: quick_sort(A, 1, 2) on [1, 2]. Pivot = 2 (index 2). Since 1 < 2, the pivot keeps index 2; both recursive calls are trivial.
Call 4: quick_sort(A, 4, 6) on [6, 8, 10]. Pivot = 10 (index 6). Both 6 and 8 are < 10, so pivot index = 6. Recurse on (4, 5): pivot = 8 (index 5), pivot index = 5.
Final: [1, 1, 2, 3, 6, 8, 10]
Problem 5. Prove that insertion sort is stable.
Answer
Insertion sort inserts $A[i]$ into the sorted prefix $A[0..i-1]$ by shifting elements one position right. The condition for shifting is A[j] > key (strictly greater). If $A[j] = key$, the element is not shifted, the inner loop stops, and key is placed after the equal element. Therefore, equal elements maintain their relative input order.
Problem 6. A sorting algorithm makes exactly 7 comparisons to sort an array of 5 elements. Is this possible? Justify using the decision tree model.
Answer
A decision tree for sorting 5 elements must have at least $5! = 120$ leaves. A binary tree of height 7 has at most $2^7 = 128$ leaves. Since $128 \ge 120$, it is theoretically possible to sort 5 elements in 7 comparisons. However, this requires a nearly balanced decision tree (each comparison splits the remaining possibilities roughly in half), which is achievable by an optimal comparison-based sorting algorithm.
Note: $2^6 = 64 < 120$, so 6 comparisons are insufficient. The minimum is $\lceil \log_2 120 \rceil = 7$ comparisons.
Problem 7. When is insertion sort preferred over merge sort despite its worse asymptotic complexity?
Answer
Insertion sort is preferred when:
- The array is small (roughly $n \le 50$; the exact cutoff is implementation-dependent): the constant factors of insertion sort are smaller
- The array is nearly sorted: insertion sort runs in $O(n + d)$ time, where $d$ is the number of inversions
- Memory is constrained: insertion sort is in-place ($O(1)$ extra space) vs merge sort's $O(n)$
- Stability is required and quick sort's instability is a concern
Many hybrid algorithms (e.g., Timsort) use insertion sort for small subarrays within merge sort.
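A toy sketch of that hybrid idea (the cutoff of 16 and the function names are ours; the insertion sort and merge routines from earlier sections are reproduced so the snippet runs standalone):

```python
CUTOFF = 16  # illustrative threshold; production libraries tune this empirically

def insertion_sort(A):
    """In-place insertion sort (as in section 3)."""
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key
    return A

def merge(L, R):
    """Standard stable merge (as in section 4)."""
    result = []
    i = j = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            result.append(L[i])
            i += 1
        else:
            result.append(R[j])
            j += 1
    result.extend(L[i:])
    result.extend(R[j:])
    return result

def hybrid_merge_sort(A):
    """Merge sort that hands small subarrays to insertion sort."""
    if len(A) <= CUTOFF:
        return insertion_sort(list(A))  # sort a copy; leave the input unchanged
    mid = len(A) // 2
    return merge(hybrid_merge_sort(A[:mid]), hybrid_merge_sort(A[mid:]))
```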
Problem 8. Show that quick sort's worst case occurs when the array is already sorted and the last element is chosen as pivot.
Answer
For sorted array [1, 2, 3, 4, 5] with pivot = last element:
Partition 1: pivot = 5, all elements < 5, pivot index = 4. Recurse on [1, 2, 3, 4].
Partition 2: pivot = 4, all elements < 4, pivot index = 3. Recurse on [1, 2, 3].
Partition 3: pivot = 3, pivot index = 2. Recurse on [1, 2].
Partition 4: pivot = 2, pivot index = 1. Recurse on [1].
Each partition processes the full remaining subarray. Total comparisons: $4 + 3 + 2 + 1 = 10$; in general, $(n-1) + (n-2) + \dots + 1 = \frac{n(n-1)}{2} = O(n^2)$.
Problem 9. Derive the recurrence relation for merge sort and solve it using the Master Theorem.
Answer
The recurrence: $T(n) = 2T(n/2) + O(n)$, with $T(1) = O(1)$.
Master Theorem: $a = 2$, $b = 2$, $f(n) = O(n)$.
$n^{\log_b a} = n^{\log_2 2} = n$.
$f(n) = \Theta(n) = \Theta(n^{\log_b a})$, which is case 2: $f(n) = \Theta(n^{\log_b a} \log^k n)$ with $k = 0$.
Therefore: $T(n) = \Theta(n^{\log_b a} \log n) = \Theta(n \log n)$.
Problem 10. Count the number of inversions in [2, 4, 1, 3, 5].
Answer
An inversion is a pair of indices $(i, j)$ with $i < j$ and $A[i] > A[j]$.
(2, 1): 2 > 1 ✓ (4, 1): 4 > 1 ✓ (4, 3): 4 > 3 ✓
Total inversions: 3.
Insertion sort would perform exactly 3 swaps (shifts) to sort this array.
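This shift/inversion correspondence can be checked by instrumenting the insertion sort from section 3 (the shift counter is added here for illustration):

```python
def insertion_sort_shifts(A):
    """Insertion sort that counts element shifts; each shift removes
    exactly one inversion, so shifts == number of inversions."""
    shifts = 0
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            shifts += 1
            j -= 1
        A[j + 1] = key
    return shifts
```

On [2, 4, 1, 3, 5] this returns 3, matching the count above.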
Problem 11. Explain how to modify merge sort to count the number of inversions in an array in $O(n \log n)$ time.
Answer
During the merge step, when an element from the right half is placed before elements remaining in the left half, each remaining left element forms an inversion with this right element.
```python
def merge_count(L, R):
    result = []
    i = j = inversions = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            result.append(L[i])
            i += 1
        else:
            result.append(R[j])
            inversions += len(L) - i
            j += 1
    result.extend(L[i:])
    result.extend(R[j:])
    return result, inversions
```
When $R[j] < L[i]$, all elements L[i:] still remaining in the left half are greater than $R[j]$, contributing len(L) - i inversions.
Total inversions = sum of the counts from all merge steps. Total time: $O(n \log n)$.
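A recursive driver completes the count. The name `sort_count` is ours, and merge_count is reproduced so the sketch runs standalone:

```python
def merge_count(L, R):
    """Merge two sorted lists, counting cross-half inversions (as above)."""
    result = []
    i = j = inversions = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            result.append(L[i])
            i += 1
        else:
            result.append(R[j])
            inversions += len(L) - i
            j += 1
    result.extend(L[i:])
    result.extend(R[j:])
    return result, inversions

def sort_count(A):
    """Return (sorted copy of A, total number of inversions in A)."""
    if len(A) <= 1:
        return A[:], 0
    mid = len(A) // 2
    left, a = sort_count(A[:mid])
    right, b = sort_count(A[mid:])
    merged, c = merge_count(left, right)
    return merged, a + b + c
```

For the array from Problem 10, sort_count([2, 4, 1, 3, 5]) yields 3 inversions, agreeing with the direct count.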
Problem 12. Explain why non-comparison-based sorts like counting sort can achieve $O(n)$ time, and state their limitations.
Answer
Counting sort does not compare elements. Instead, it counts the frequency of each distinct key value and uses these counts to determine output positions. If the key values are integers in the range $[0, k]$, counting sort runs in $O(n + k)$ time.
Limitations:
- Only works when keys are integers (or can be mapped to integers)
- Inefficient when $k$ is very large compared to $n$ (e.g., sorting 100 elements with keys up to $10^9$)
- Not comparison-based, so the $\Omega(n \log n)$ lower bound does not apply
- Not in-place (requires $O(n + k)$ extra space)
- Not stable in its basic form (but can be made stable)

Radix sort extends counting sort to handle larger key ranges by sorting digit by digit, achieving $O(d(n + b))$, where $d$ is the number of digits and $b$ is the base.
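A minimal stable counting-sort sketch for integer keys in $[0, k]$, using the prefix-sum formulation (which is the usual way to make it stable):

```python
def counting_sort(A, k):
    """Stable counting sort for integers in [0, k]: O(n + k) time and space."""
    count = [0] * (k + 1)
    for x in A:
        count[x] += 1
    # Exclusive prefix sums: count[v] becomes the output index
    # where the run of value v starts.
    total = 0
    for v in range(k + 1):
        count[v], total = total, total + count[v]
    out = [0] * len(A)
    for x in A:              # left-to-right pass keeps equal keys in input order
        out[count[x]] = x
        count[x] += 1
    return out
```

Note there are no key comparisons anywhere, which is why the $\Omega(n \log n)$ bound does not constrain it.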
For revision on data structures used in sorting, see Trees (heap sort) and Linked Lists (merge sort).