
Sorting Algorithms

1. Introduction

The Sorting Problem: Given an array $A[0..n-1]$, rearrange the elements into non-decreasing order: $A[0] \leq A[1] \leq \cdots \leq A[n-1]$.

Stability: A sort is stable if elements with equal keys maintain their relative order from the input. Stability matters when sorting by multiple keys (e.g., sort by surname, then by first name).

In-place: A sort is in-place if it uses $O(1)$ extra memory (excluding the input array).
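To make the multi-key use case concrete, here is a minimal sketch (the `records` data is illustrative), relying on the fact that Python's built-in `sorted` is guaranteed stable:

```python
# Sort by surname, then exploit stability to order equal surnames by
# first name: sort by the secondary key first, then by the primary key.
records = [("Smith", "Alice"), ("Jones", "Bob"), ("Smith", "Carol")]

by_first = sorted(records, key=lambda r: r[1])     # secondary key
by_surname = sorted(by_first, key=lambda r: r[0])  # stable: equal surnames
                                                   # keep first-name order
print(by_surname)
# [('Jones', 'Bob'), ('Smith', 'Alice'), ('Smith', 'Carol')]
```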


2. Bubble Sort

Algorithm

Repeatedly step through the array, comparing adjacent pairs and swapping if they are in the wrong order. After each pass, the largest unsorted element "bubbles up" to its correct position.

```python
def bubble_sort(A):
    n = len(A)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                swapped = True
        if not swapped:
            break
    return A
```

Correctness Proof

Theorem. After the $i$-th pass ($0$-indexed), the $i+1$ largest elements are in their final positions at the end of the array.

Proof. By induction on $i$.

Base case ($i = 0$). The inner loop compares each adjacent pair from index $0$ to $n-2$. Whenever $A[j] > A[j+1]$, they are swapped. This ensures the maximum element moves rightward through every comparison until it reaches index $n-1$. ✓

Inductive step. Assume after pass $i-1$, the $i$ largest elements are at indices $n-i, \ldots, n-1$. Pass $i$ operates on indices $0$ to $n-i-1$. By the same argument, the maximum element in this range moves to index $n-i-1$. The $i+1$ largest elements are now at indices $n-i-1, \ldots, n-1$. ✓

After $n-1$ passes, all elements are sorted. $\square$

Complexity

  • Worst case: $O(n^2)$ — when the array is in reverse order
  • Best case: $O(n)$ — when the array is already sorted (early termination via the swapped flag)
  • Average case: $O(n^2)$
  • Space: $O(1)$ — in-place
  • Stable: Yes

Proof of worst case. In reverse order, pass $i$ performs $n - 1 - i$ comparisons, each of which triggers a swap. Total comparisons:

$$\sum_{i=0}^{n-2}(n - 1 - i) = \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2} = O(n^2)$$

$\square$
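This count can be checked empirically. The sketch below assumes an instrumented variant, `bubble_sort_comparisons`, that mirrors the code above but tallies comparisons:

```python
def bubble_sort_comparisons(A):
    """Bubble sort instrumented to return the number of comparisons made."""
    n = len(A)
    comparisons = 0
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                swapped = True
        if not swapped:
            break
    return comparisons

n = 10
# Reverse order never triggers early termination, so all n-1 passes run.
assert bubble_sort_comparisons(list(range(n, 0, -1))) == n * (n - 1) // 2
```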


3. Insertion Sort

Algorithm

Build the sorted array one element at a time by inserting each element into its correct position among the previously sorted elements.

```python
def insertion_sort(A):
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key
    return A
```

Correctness Proof

Theorem. After the $i$-th iteration of the outer loop ($1 \leq i < n$), the subarray $A[0..i]$ is sorted.

Proof. By induction on $i$.

Base case ($i = 1$). $A[0..1]$ contains at most 2 elements. If $A[0] > A[1]$, then $A[0]$ is shifted right and the key is placed at index 0; otherwise, no change. Either way, $A[0..1]$ is sorted. ✓

Inductive step. Assume $A[0..i-1]$ is sorted. We insert $A[i]$ (stored as key) by shifting elements greater than key one position right. Since $A[0..i-1]$ is sorted, all elements greater than key form a contiguous suffix. After shifting, key is placed at the first position where the element to its left is $\leq$ key. The resulting $A[0..i]$ is sorted. ✓

$\square$

Complexity

  • Worst case: $O(n^2)$ — reverse order
  • Best case: $O(n)$ — already sorted (inner loop never executes)
  • Average case: $O(n^2)$
  • Space: $O(1)$ — in-place
  • Stable: Yes

Proof of average case. For a uniformly random input, the key $A[i]$ is equally likely to belong in any of the $i+1$ positions of the sorted prefix, so each insertion shifts about half of it on average:

$$T(n) = \sum_{i=1}^{n-1} \frac{i}{2} = \frac{1}{2}\sum_{i=1}^{n-1} i = \frac{1}{2} \cdot \frac{n(n-1)}{2} = \frac{n(n-1)}{4} = O(n^2)$$

$\square$


4. Merge Sort

Algorithm

Divide the array into two halves, recursively sort each half, then merge the two sorted halves.

```python
def merge_sort(A):
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    left = merge_sort(A[:mid])
    right = merge_sort(A[mid:])
    return merge(left, right)

def merge(L, R):
    result = []
    i = j = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            result.append(L[i])
            i += 1
        else:
            result.append(R[j])
            j += 1
    result.extend(L[i:])
    result.extend(R[j:])
    return result
```

Correctness Proof

Theorem. merge_sort(A) returns a sorted permutation of $A$.

Proof. By structural induction on the array size $n$.

Base case. $n \leq 1$: the array is trivially sorted. ✓

Inductive step. Assume merge_sort correctly sorts arrays of size $< n$. For an array of size $n$:

  1. Split into $L$ (size $\lfloor n/2 \rfloor$) and $R$ (size $\lceil n/2 \rceil$). By the inductive hypothesis, merge_sort(L) and merge_sort(R) return sorted arrays.
  2. merge combines them: at each step, it appends the smaller of the two front elements. This produces a sorted array (standard merge of two sorted sequences).
  3. merge appends all remaining elements, so no elements are lost.

The result is a sorted permutation of the input. ✓ $\square$

Complexity

Theorem. Merge sort runs in $O(n \log n)$ time.

Proof. The recurrence relation is:

$$T(n) = 2T(n/2) + O(n), \quad T(1) = O(1)$$

The $O(n)$ term comes from the merge step, which processes each element exactly once.

By the Master Theorem: $a = 2$, $b = 2$, $f(n) = \Theta(n)$. Since $n^{\log_b a} = n^{\log_2 2} = n$ and $f(n) = \Theta(n^{\log_b a})$, case 2 applies. Therefore:

$$T(n) = O(n \log n)$$

Space: $O(n)$ — the merge step requires a temporary array. Stable: Yes (merge uses <=).
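To see the role of `<=` in merge, here is a minimal sketch (the key-extracting variant `merge_sort_by_key` and its test data are illustrative, not part of the code above): sorting `(key, tag)` pairs by key keeps equal keys in input order.

```python
def merge_sort_by_key(A, key=lambda x: x):
    """Merge sort that compares key(x) only, so ties keep input order."""
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    L = merge_sort_by_key(A[:mid], key)
    R = merge_sort_by_key(A[mid:], key)
    result, i, j = [], 0, 0
    while i < len(L) and j < len(R):
        if key(L[i]) <= key(R[j]):  # <= takes from the left half on ties
            result.append(L[i])
            i += 1
        else:
            result.append(R[j])
            j += 1
    result.extend(L[i:])
    result.extend(R[j:])
    return result

pairs = [(2, "a"), (1, "c"), (2, "b")]
print(merge_sort_by_key(pairs, key=lambda p: p[0]))
# [(1, 'c'), (2, 'a'), (2, 'b')] -- 'a' stays before 'b'
```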


5. Quick Sort

Algorithm

Select a pivot element, partition the array so that elements $<$ pivot are on the left and elements $\geq$ pivot are on the right, then recursively sort the two partitions.

```python
def quick_sort(A, low=0, high=None):
    if high is None:
        high = len(A) - 1
    if low < high:
        pivot_index = partition(A, low, high)
        quick_sort(A, low, pivot_index - 1)
        quick_sort(A, pivot_index + 1, high)

def partition(A, low, high):
    pivot = A[high]
    i = low - 1
    for j in range(low, high):
        if A[j] < pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[high] = A[high], A[i + 1]
    return i + 1
```
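A quick sanity check of the partition post-condition proved below (the test array is arbitrary):

```python
A = [7, 2, 9, 4, 3, 5]  # pivot = 5 (last element)
p = partition(A, 0, len(A) - 1)
assert all(x < A[p] for x in A[:p])       # left of pivot: strictly smaller
assert all(x >= A[p] for x in A[p + 1:])  # right of pivot: greater or equal
print(A, "pivot index:", p)  # [2, 4, 3, 5, 9, 7] pivot index: 3
```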

Correctness Proof

Theorem. After partition(A, low, high), the pivot is at its final sorted position, all elements to its left are $<$ pivot, and all elements to its right are $\geq$ pivot.

Proof. The variable i tracks the boundary between elements $<$ pivot (indices low..i) and elements $\geq$ pivot (indices i+1..j-1). The loop invariant:

At the start of each iteration with index j:

  • $A[\mathrm{low}..i]$ contains only elements $<$ pivot
  • $A[i+1..j-1]$ contains only elements $\geq$ pivot
  • $A[\mathrm{high}] = \mathrm{pivot}$ (unchanged)

Maintenance. If $A[j] < \mathrm{pivot}$, increment $i$ and swap $A[i]$ with $A[j]$, extending the "$<$ pivot" region. If $A[j] \geq \mathrm{pivot}$, only $j$ advances, extending the "$\geq$ pivot" region.

Termination. After the loop, swap $A[i+1]$ with $A[\mathrm{high}]$ (the pivot). Now:

  • $A[\mathrm{low}..i]$ all $<$ pivot
  • $A[i+1] = \mathrm{pivot}$ (final position)
  • $A[i+2..\mathrm{high}]$ all $\geq$ pivot

$\square$

Complexity

| Case | Time | Pivot choice |
| --- | --- | --- |
| Best | $O(n \log n)$ | Median |
| Average | $O(n \log n)$ | Random |
| Worst | $O(n^2)$ | Min/max (e.g., already-sorted input) |

Space: $O(\log n)$ average (recursion stack), $O(n)$ worst. Stable: No (partitioning swaps can change relative order).

Proof of worst case. If the pivot is always the smallest or largest element, one partition has size 0 and the other has size $n-1$:

$$T(n) = T(n-1) + O(n) = \sum_{k=1}^{n} O(k) = O(n^2)$$

Proof of average case. With random pivot selection, the split is balanced enough in expectation that the recurrence behaves like $T(n) = 2T(n/2) + O(n)$, giving $T(n) = O(n \log n)$ by the Master Theorem. (A rigorous proof sums over all possible pivot ranks and reaches the same bound.)
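A minimal sketch of the randomized variant this argument assumes, reusing partition from above (the function name is illustrative):

```python
import random

def randomized_quick_sort(A, low=0, high=None):
    """Quick sort with a uniformly random pivot; expected O(n log n)."""
    if high is None:
        high = len(A) - 1
    if low < high:
        r = random.randint(low, high)  # pick a pivot position uniformly
        A[r], A[high] = A[high], A[r]  # move it into the pivot slot
        p = partition(A, low, high)
        randomized_quick_sort(A, low, p - 1)
        randomized_quick_sort(A, p + 1, high)
```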


6. Comparison Table

| Algorithm | Best | Average | Worst | Space | Stable |
| --- | --- | --- | --- | --- | --- |
| Bubble Sort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | Yes |
| Insertion Sort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | Yes |
| Merge Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(n)$ | Yes |
| Quick Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n^2)$ | $O(\log n)$ | No |
| Heap Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(1)$ | No |

7. The $\Omega(n \log n)$ Lower Bound for Comparison-Based Sorting

Theorem. Any comparison-based sorting algorithm requires $\Omega(n \log n)$ comparisons in the worst case.

Proof. A comparison-based sorting algorithm can be modelled as a decision tree. Each internal node represents a comparison (e.g., "$A[i] \leq A[j]$?"), and each leaf represents a permutation of the input (a possible output).

There are $n!$ possible permutations of $n$ elements, so the decision tree has at least $n!$ leaves.

A binary tree of height $h$ has at most $2^h$ leaves, so:

$$2^h \geq n! \implies h \geq \log_2(n!)$$

Using Stirling's approximation, $n! \approx \left(\frac{n}{e}\right)^n \sqrt{2\pi n}$, so:

$$\log_2(n!) = n\log_2 n - n\log_2 e + O(\log n) = \Omega(n \log n)$$

Therefore, any comparison-based sorting algorithm requires at least $\Omega(n \log n)$ comparisons in the worst case. $\square$

:::info
Merge sort and heap sort are asymptotically optimal among comparison-based sorts. Non-comparison sorts (radix sort, counting sort) can beat $O(n \log n)$ but have restrictions on key types.
:::


Problem Set

Problem 1. Trace bubble sort on the array [5, 1, 4, 2, 8]. Show the array after each pass.

Answer

Pass 1: Compare and swap adjacent pairs. [5, 1, 4, 2, 8] → [1, 5, 4, 2, 8] → [1, 4, 5, 2, 8] → [1, 4, 2, 5, 8] → [1, 4, 2, 5, 8]. After pass 1: [1, 4, 2, 5, 8]

Pass 2: [1, 4, 2, 5, 8] → [1, 4, 2, 5, 8] (no swap) → [1, 2, 4, 5, 8] → [1, 2, 4, 5, 8]. After pass 2: [1, 2, 4, 5, 8]

Pass 3: No swaps → sorted. Early termination.

Problem 2. Trace insertion sort on the array [5, 1, 4, 2, 8]. Show the array after each insertion.

Answer
| $i$ | key | Array state |
| --- | --- | --- |
| 1 | 1 | [1, 5, 4, 2, 8] |
| 2 | 4 | [1, 4, 5, 2, 8] |
| 3 | 2 | [1, 2, 4, 5, 8] |
| 4 | 8 | [1, 2, 4, 5, 8] (no shift) |

Sorted: [1, 2, 4, 5, 8]

Problem 3. Show the merge process when merging [1, 3, 5] and [2, 4, 6, 8].

Answer
| Step | L remaining | R remaining | Output |
| --- | --- | --- | --- |
| 1 | [1, 3, 5] | [2, 4, 6, 8] | 1 < 2 → output 1 |
| 2 | [3, 5] | [2, 4, 6, 8] | 3 > 2 → output 2 |
| 3 | [3, 5] | [4, 6, 8] | 3 < 4 → output 3 |
| 4 | [5] | [4, 6, 8] | 5 > 4 → output 4 |
| 5 | [5] | [6, 8] | 5 < 6 → output 5 |
| 6 | [] | [6, 8] | append [6, 8] |

Result: [1, 2, 3, 4, 5, 6, 8]

Problem 4. Trace quick sort on [3, 6, 8, 10, 1, 2, 1] using the last element as pivot. Show the array and pivot after each partition.

Answer

Call 1: quick_sort(A, 0, 6) on [3, 6, 8, 10, 1, 2, 1]. Pivot = 1 (index 6). No element is strictly less than 1, so the closing swap places the pivot at index 0: [1, 6, 8, 10, 1, 2, 3]. Pivot index = 0. Recurse on the empty left range (0, -1) and on (1, 6).

Call 2: quick_sort(A, 1, 6) on [6, 8, 10, 1, 2, 3]. Pivot = 3 (index 6). The loop moves 1 and 2 left, giving [1, 1, 2, 10, 6, 8, 3]; the closing swap yields [1, 1, 2, 3, 6, 8, 10]. Pivot index = 3. Recurse on (1, 2) and (4, 6).

Call 3: quick_sort(A, 1, 2) on [1, 2]. Pivot = 2 (index 2). 1 < 2, so the order is unchanged. Pivot index = 2.

Call 4: quick_sort(A, 4, 6) on [6, 8, 10]. Pivot = 10 (index 6). All elements are < 10, so the order is unchanged. Pivot index = 6. Recurse on (4, 5) and the empty range (7, 6).

Call 5: quick_sort(A, 4, 5) on [6, 8]. Pivot = 8 (index 5). 6 < 8; pivot index = 5.

Final: [1, 1, 2, 3, 6, 8, 10]

Problem 5. Prove that insertion sort is stable.

Answer

Insertion sort inserts $A[i]$ into the sorted portion $A[0..i-1]$ by shifting elements $> A[i]$ one position right. The condition for shifting is A[j] > key (strictly greater). If $A[j] = \mathrm{key}$, the element is not shifted, and key is placed after the equal element. Therefore, equal elements maintain their relative input order. $\square$

Problem 6. A sorting algorithm makes exactly 7 comparisons to sort an array of 5 elements. Is this possible? Justify using the decision tree model.

Answer

A decision tree for sorting 5 elements must have at least $5! = 120$ leaves. A binary tree of height 7 has at most $2^8 - 1 = 255$ nodes and at most $2^7 = 128$ leaves. Since $128 \geq 120$, it is theoretically possible to sort 5 elements in 7 comparisons. However, this requires a nearly perfectly balanced decision tree (each comparison splits the remaining possibilities roughly in half), which is achievable by an optimal comparison-based sorting algorithm.

Note: $2^6 = 64 < 120$, so 6 comparisons are insufficient. The minimum is $\lceil \log_2 120 \rceil = 7$ comparisons.
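The bound can be checked directly with a short computation (a sketch using only the standard library):

```python
import math

n = 5
lower_bound = math.ceil(math.log2(math.factorial(n)))
print(lower_bound)  # 7, since 2^6 = 64 < 120 <= 2^7 = 128
```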

Problem 7. When is insertion sort preferred over merge sort despite its worse asymptotic complexity?

Answer

Insertion sort is preferred when:

  1. The array is small (typically $n < 20$): the constant factors of insertion sort are smaller
  2. The array is nearly sorted: insertion sort runs in $O(n + d)$ where $d$ is the number of inversions
  3. Memory is constrained: insertion sort is in-place ($O(1)$ extra space) vs merge sort's $O(n)$
  4. Stability is required and quick sort's instability is a concern

Many hybrid algorithms (e.g., Timsort) use insertion sort for small subarrays within merge sort, as sketched below.
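A minimal sketch of that idea, reusing insertion_sort (Section 3) and merge (Section 4); the cutoff value is an assumption, not Timsort's actual tuning:

```python
CUTOFF = 20  # assumed threshold; real libraries tune this empirically

def hybrid_sort(A):
    """Merge sort that delegates small subarrays to insertion sort."""
    if len(A) <= CUTOFF:
        return insertion_sort(A)  # low constant factors win on tiny inputs
    mid = len(A) // 2
    return merge(hybrid_sort(A[:mid]), hybrid_sort(A[mid:]))
```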

Problem 8. Show that quick sort's worst case occurs when the array is already sorted and the last element is chosen as pivot.

Answer

For sorted array [1, 2, 3, 4, 5] with pivot = last element:

Partition 1: pivot = 5, all elements < 5, pivot index = 4. Recurse on [1, 2, 3, 4].
Partition 2: pivot = 4, all elements < 4, pivot index = 3. Recurse on [1, 2, 3].
Partition 3: pivot = 3, pivot index = 2. Recurse on [1, 2].
Partition 4: pivot = 2, pivot index = 1. Recurse on [1].

Each partition processes the full remaining array. Total comparisons: $(n-1) + (n-2) + \cdots + 1 = n(n-1)/2 = O(n^2)$.

Problem 9. Derive the recurrence relation for merge sort and solve it using the Master Theorem.

Answer

The recurrence: $T(n) = 2T(n/2) + cn$, $T(1) = d$.

Master Theorem: $a = 2$, $b = 2$, $f(n) = cn$.

$n^{\log_b a} = n^{\log_2 2} = n^1 = n$.

$f(n) = cn = \Theta(n^1)$, which is case 2: $f(n) = \Theta(n^{\log_b a} \log^k n)$ with $k = 0$.

Therefore: $T(n) = \Theta(n \log n)$.

Problem 10. Count the number of inversions in [2, 4, 1, 3, 5].

Answer

An inversion is a pair $(i, j)$ with $i < j$ and $A[i] > A[j]$.

The inverted pairs (by value): (2, 1): 2 > 1 ✓; (4, 1): 4 > 1 ✓; (4, 3): 4 > 3 ✓

Total inversions: 3.

Insertion sort would perform exactly 3 shifts to sort this array. A brute-force check is sketched below.
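An illustrative $O(n^2)$ verification (the function name is hypothetical):

```python
def count_inversions_naive(A):
    """Count pairs (i, j) with i < j and A[i] > A[j] by brute force."""
    return sum(1 for i in range(len(A))
                 for j in range(i + 1, len(A))
                 if A[i] > A[j])

print(count_inversions_naive([2, 4, 1, 3, 5]))  # 3
```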

Problem 11. Explain how to modify merge sort to count the number of inversions in an array in O(nlogn)O(n \log n) time.

Answer

During the merge step, when an element from the right half is placed before elements remaining in the left half, each remaining left element forms an inversion with this right element.

```python
def merge_count(L, R):
    result = []
    i = j = inversions = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            result.append(L[i])
            i += 1
        else:
            result.append(R[j])
            inversions += len(L) - i
            j += 1
    result.extend(L[i:])
    result.extend(R[j:])
    return result, inversions
```

When $R[j] < L[i]$, all elements $L[i], L[i+1], \ldots$ in the left half are greater than $R[j]$, contributing len(L) - i inversions.

Total inversions = the split inversions from all merge steps plus those counted within each half; a recursive wrapper is sketched below. Total time: $O(n \log n)$.
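A recursive wrapper completing the algorithm (the name `count_inversions` is illustrative; it mirrors merge_sort's structure):

```python
def count_inversions(A):
    """Return (sorted copy of A, number of inversions in A)."""
    if len(A) <= 1:
        return A, 0
    mid = len(A) // 2
    left, inv_left = count_inversions(A[:mid])
    right, inv_right = count_inversions(A[mid:])
    merged, inv_split = merge_count(left, right)
    return merged, inv_left + inv_right + inv_split

print(count_inversions([2, 4, 1, 3, 5]))  # ([1, 2, 3, 4, 5], 3)
```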

Problem 12. Explain why non-comparison-based sorts like counting sort can achieve O(n)O(n) time, and state their limitations.

Answer

Counting sort does not compare elements. Instead, it counts the frequency of each distinct key value and uses these counts to determine positions. If the key values are integers in the range $[0, k]$, counting sort runs in $O(n + k)$ time.

Limitations:

  1. Only works when keys are integers (or can be mapped to integers)
  2. Inefficient when $k$ is very large compared to $n$ (e.g., sorting 100 elements with keys up to $10^9$)
  3. Not comparison-based, so the $\Omega(n \log n)$ lower bound does not apply
  4. Not in-place (requires $O(n + k)$ extra space)
  5. The simplest variant (rewriting keys from their counts) discards satellite data; the prefix-sum variant is stable, which radix sort relies on

Radix sort extends counting sort to handle larger key ranges by sorting digit by digit, achieving $O(d(n + b))$ where $d$ is the number of digits and $b$ is the base.
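A minimal counting sort sketch in the stable prefix-sum form (assuming non-negative integer keys; the function name is illustrative):

```python
def counting_sort(A, k):
    """Stable counting sort for integers in [0, k]; O(n + k) time."""
    count = [0] * (k + 1)
    for x in A:  # tally each key value
        count[x] += 1
    for v in range(1, k + 1):  # prefix sums: count[v] = # of elements <= v
        count[v] += count[v - 1]
    out = [0] * len(A)
    for x in reversed(A):  # right-to-left scan keeps equal keys stable
        count[x] -= 1
        out[count[x]] = x
    return out

print(counting_sort([4, 2, 2, 8, 3, 3, 1], 8))  # [1, 2, 2, 3, 3, 4, 8]
```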

For revision on data structures used in sorting, see Trees (heap sort) and Linked Lists (merge sort).
