
Complexity Analysis

1. Formal Definitions

Asymptotic Notation

Let $f, g: \mathbb{N} \to \mathbb{R}^+$ be functions.

Big-O (Upper Bound)

$$f(n) = O(g(n)) \iff \exists\, c > 0,\ n_0 \in \mathbb{N} \text{ s.t. } f(n) \leq c \cdot g(n)\ \forall\, n \geq n_0$$

Intuition: $f$ is eventually bounded above by a constant multiple of $g$.

Big-Omega (Lower Bound)

$$f(n) = \Omega(g(n)) \iff \exists\, c > 0,\ n_0 \in \mathbb{N} \text{ s.t. } f(n) \geq c \cdot g(n)\ \forall\, n \geq n_0$$

Intuition: $f$ is eventually bounded below by a constant multiple of $g$.

Big-Theta (Tight Bound)

$$f(n) = \Theta(g(n)) \iff f(n) = O(g(n)) \land f(n) = \Omega(g(n))$$

$$\iff \exists\, c_1, c_2 > 0,\ n_0 \in \mathbb{N} \text{ s.t. } c_1 \cdot g(n) \leq f(n) \leq c_2 \cdot g(n)\ \forall\, n \geq n_0$$

Intuition: $f$ grows at the same rate as $g$, up to constant factors.

Little-o (Strict Upper Bound)

$$f(n) = o(g(n)) \iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$$

Little-omega (Strict Lower Bound)

$$f(n) = \omega(g(n)) \iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$$
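
These limit tests are easy to probe numerically. Below is a minimal sketch (the `ratio_table` helper is purely illustrative, not from any syllabus) that prints $f(n)/g(n)$ for growing $n$; a ratio shrinking towards $0$ is consistent with $f = o(g)$, although a finite table can only suggest a limit, never prove it.

```python
import math

def ratio_table(f, g, label):
    # Print f(n)/g(n) for growing n; a ratio tending to 0 suggests f = o(g).
    for n in [10**2, 10**4, 10**6, 10**8]:
        print(f"{label}: n = {n:>9}, f/g = {f(n) / g(n):.6f}")

# log n = o(n): the ratio shrinks towards 0
ratio_table(lambda n: math.log2(n), lambda n: n, "log n vs n")

# n = o(n log n): likewise consistent with the definition
ratio_table(lambda n: n, lambda n: n * math.log2(n), "n vs n log n")
```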


2. The Complexity Hierarchy

Theorem. For the standard growth functions used in algorithm analysis, the following hierarchy holds:

$$O(1) \subset o(\log n) \subset O(\log n) \subset o(\sqrt{n}) \subset O(\sqrt{n}) \subset o(n) \subset O(n) \subset o(n \log n) \subset O(n \log n) \subset o(n^2) \subset O(n^2) \subset \cdots \subset O(2^n) \subset o(n!)$$

Proof (selected). We prove $n = o(n \log n)$:

$$\lim_{n \to \infty} \frac{n}{n \log n} = \lim_{n \to \infty} \frac{1}{\log n} = 0$$

Similarly, $\log n = o(n)$:

$$\lim_{n \to \infty} \frac{\log n}{n} = 0 \quad \text{(by L'Hôpital's rule)}$$

And $n^k = o(2^n)$ for any constant $k$:

$$\lim_{n \to \infty} \frac{n^k}{2^n} = 0 \quad \text{(exponential dominates polynomial)}$$

$\square$

Common Complexity Classes

| Class | Name | Example |
| --- | --- | --- |
| $O(1)$ | Constant | Array access |
| $O(\log n)$ | Logarithmic | Binary search |
| $O(n)$ | Linear | Linear search |
| $O(n \log n)$ | Linearithmic | Merge sort, heap sort |
| $O(n^2)$ | Quadratic | Bubble sort, insertion sort |
| $O(n^3)$ | Cubic | Floyd-Warshall, matrix multiplication |
| $O(2^n)$ | Exponential | Subset enumeration, brute force |
| $O(n!)$ | Factorial | Permutation enumeration (TSP) |
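
To make the gaps between these classes concrete, here is a short illustrative script (the sample sizes and dictionary of functions are my own choices, not from the syllabus) that tabulates approximate operation counts:

```python
import math

# Approximate operation counts for each complexity class at sample input sizes.
classes = {
    "O(1)":       lambda n: 1,
    "O(log n)":   lambda n: math.log2(n),
    "O(n)":       lambda n: n,
    "O(n log n)": lambda n: n * math.log2(n),
    "O(n^2)":     lambda n: n ** 2,
    "O(2^n)":     lambda n: 2 ** n,
}

for n in [10, 20, 40]:
    print(f"n = {n}:")
    for name, fn in classes.items():
        print(f"  {name:<10} ~ {fn(n):,.0f}")
```
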
:::info

Board-specific:

- AQA requires Big-O notation for standard algorithms (searching, sorting); focuses on time complexity.
- CIE (9618) covers Big-O, Big-Theta, and Big-Omega notation; requires space complexity analysis.
- OCR (A) requires Big-O notation; may require comparison of algorithm performance.
- Edexcel covers time and space complexity with Big-O notation.

:::


3. Best, Average, and Worst Case

Definitions

For an algorithm with input size $n$, let $D_n$ be the set of all possible inputs of size $n$, and let $T(A, I)$ be the running time of algorithm $A$ on input $I$.

- Best case: $T_{\mathrm{best}}(n) = \min_{I \in D_n} T(A, I)$
- Worst case: $T_{\mathrm{worst}}(n) = \max_{I \in D_n} T(A, I)$
- Average case: $T_{\mathrm{avg}}(n) = \sum_{I \in D_n} T(A, I) \cdot P(I)$

where $P(I)$ is the probability of input $I$.

Example: Quick Sort

| Case | Complexity | When |
| --- | --- | --- |
| Best | $O(n \log n)$ | Pivot always splits in half |
| Average | $O(n \log n)$ | Random inputs (expected) |
| Worst | $O(n^2)$ | Already sorted input, min/max pivot |

:::warning Pitfall

Average case assumes a uniform distribution of inputs. Real-world data may not be uniformly distributed. Always state the distribution assumption when discussing average case.

:::
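
The gap between quicksort's cases is easy to observe directly. Below is a hedged sketch (a first-element-pivot quicksort instrumented with a comparison counter; illustrative, not an exam-board reference implementation): sorted input drives the count towards $n^2/2$, while shuffled input stays near $n \log n$.

```python
import random

def quicksort_comparisons(arr):
    # Quicksort with a first-element pivot, returning the number of comparisons.
    comparisons = 0

    def sort(items):
        nonlocal comparisons
        if len(items) <= 1:
            return items
        pivot, rest = items[0], items[1:]
        comparisons += len(rest)  # one comparison per element against the pivot
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return sort(left) + [pivot] + sort(right)

    sort(arr)
    return comparisons

n = 500
print("sorted input:", quicksort_comparisons(list(range(n))))               # ~ n^2/2
print("random input:", quicksort_comparisons(random.sample(range(n), n)))  # ~ n log n
```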


4. Analyzing Recursive Algorithms

The Master Theorem

For recurrences of the form $T(n) = aT(n/b) + f(n)$ where $a \geq 1$, $b > 1$:

Let $c = \log_b a$. Compare $f(n)$ with $n^c$:

| Case | Condition | Solution |
| --- | --- | --- |
| 1 | $f(n) = O(n^{c - \epsilon})$ for some $\epsilon > 0$ | $T(n) = \Theta(n^c)$ |
| 2 | $f(n) = \Theta(n^c \log^k n)$ | $T(n) = \Theta(n^c \log^{k+1} n)$ |
| 3 | $f(n) = \Omega(n^{c + \epsilon})$ and $a f(n/b) \leq \delta f(n)$ for some $\delta < 1$ | $T(n) = \Theta(f(n))$ |

Worked Examples

Example 1: Merge sort: $T(n) = 2T(n/2) + O(n)$

$a = 2$, $b = 2$, $c = \log_2 2 = 1$. $f(n) = \Theta(n) = \Theta(n^c)$. Case 2 with $k = 0$: $T(n) = \Theta(n \log n)$.

Example 2: Binary search: $T(n) = T(n/2) + O(1)$

$a = 1$, $b = 2$, $c = \log_2 1 = 0$. $f(n) = \Theta(1) = \Theta(n^c)$. Case 2 with $k = 0$: $T(n) = \Theta(\log n)$.

Example 3: $T(n) = 3T(n/2) + O(n)$

$a = 3$, $b = 2$, $c = \log_2 3 \approx 1.585$. $f(n) = O(n) = O(n^{c - \epsilon})$ with $\epsilon \approx 0.585$. Case 1: $T(n) = \Theta(n^{\log_2 3})$.

Example 4: $T(n) = 2T(n/2) + O(n^2)$

$a = 2$, $b = 2$, $c = 1$. $f(n) = \Theta(n^2) = \Omega(n^{c + \epsilon})$ with $\epsilon = 1$. Check regularity: $2(n/2)^2 = n^2/2 \leq \delta n^2$ for $\delta = 1/2$. Case 3: $T(n) = \Theta(n^2)$.
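
The case analysis mechanises nicely. The following sketch (a hypothetical helper; it assumes the driving function has the form $\Theta(n^d \log^k n)$ and that the Case 3 regularity condition holds whenever $d > c$) classifies such recurrences by comparing $d$ with $c = \log_b a$:

```python
import math

def master_theorem(a, b, d, k=0):
    # Classify T(n) = a*T(n/b) + Theta(n^d * log^k n).
    # Note: comparing floats for equality is fragile for awkward a, b, d.
    c = math.log(a, b)
    if d < c:
        return f"Case 1: Theta(n^{c:.3f})"
    if d == c:
        return f"Case 2: Theta(n^{c:.3f} * log^{k + 1} n)"
    return f"Case 3: Theta(n^{d} * log^{k} n)" if k else f"Case 3: Theta(n^{d})"

print(master_theorem(2, 2, 1))  # merge sort    -> Case 2: Theta(n log n)
print(master_theorem(1, 2, 0))  # binary search -> Case 2: Theta(log n)
print(master_theorem(3, 2, 1))  # Example 3     -> Case 1: Theta(n^1.585)
print(master_theorem(2, 2, 2))  # Example 4     -> Case 3: Theta(n^2)
```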


5. Amortised Analysis

Motivation

Some operations are expensive in isolation but cheap on average. Amortised analysis gives the average cost per operation over a worst-case sequence of operations (not an average over random inputs).

Methods

Aggregate Method

Compute the total cost of $n$ operations and divide by $n$.

Example: Dynamic array. Total cost of $n$ insertions: $O(n)$ (proved in Arrays and Records). Amortised cost per insertion: $O(1)$.

Accounting Method

Assign an amortised cost to each operation. Some operations are charged more than their actual cost (creating a "credit"); others are charged less (consuming credit). The credit must always be non-negative.

Example: Dynamic array. Charge \$3 per insertion: \$1 for the actual insert, \$2 saved for future resizing. When a resize of $k$ elements occurs, it costs $O(k)$, which is covered by the $\$2k$ credit accumulated from the $k$ insertions.

Potential Method

Define a potential function $\Phi$ mapping data structure states to non-negative real numbers. The amortised cost of operation $i$ is:

$$\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$$

Total amortised cost: $\sum_{i=1}^{n} \hat{c}_i = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0)$

Since $\Phi \geq 0$ and $\Phi(D_0) = 0$: $\sum \hat{c}_i \geq \sum c_i$, so the total amortised cost bounds the total actual cost.

Example: Dynamic array. Let $\Phi = 2(\mathrm{size} - \mathrm{capacity}/2)$ when $\mathrm{size} \geq \mathrm{capacity}/2$, else $0$.

- Insert without resize: actual cost $= 1$; $\Phi$ increases by $2$. Amortised cost $= 1 + 2 = 3$.
- Insert with resize from $k$ to $2k$: actual cost $= k$ (copy) $+ 1$ (insert). Before: $\mathrm{size} = k$, $\mathrm{capacity} = k$, so $\Phi = 2(k - k/2) = k$. After: $\mathrm{size} = k + 1$, $\mathrm{capacity} = 2k$, so $\Phi = 2(k + 1 - k) = 2$. Change: $2 - k$. Amortised $= (k + 1) + (2 - k) = 3$.

Amortised cost per operation: at most $3 = O(1)$. $\square$
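
The bound is easy to reproduce empirically. Here is a minimal simulation of a doubling array (the cost model, one unit per element written or copied, is an assumption for illustration):

```python
def simulate_appends(n):
    # Count element writes and copies for n appends to a doubling array.
    capacity, size, total_cost = 1, 0, 0
    for _ in range(n):
        if size == capacity:      # full: double the capacity
            total_cost += size    # copy every existing element
            capacity *= 2
        total_cost += 1           # write the new element
        size += 1
    return total_cost

for n in [10, 1_000, 1_000_000]:
    cost = simulate_appends(n)
    print(f"n = {n:>9}: total cost = {cost:>9}, per operation = {cost / n:.3f}")
```

The per-operation figure stays below 3, matching the potential-method bound.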


6. Practical Analysis

Space-Time Tradeoffs

Sometimes we can reduce time complexity by using more memory, or reduce space by accepting more time.

| Tradeoff | Example |
| --- | --- |
| Hash table | $O(1)$ search vs $O(n)$ space |
| Memoization | $O(1)$ repeated lookups vs $O(n)$ space |
| Precomputed tables | $O(1)$ trig functions vs large ROM |

Logarithmic Factors

$\log n$ grows very slowly. For all practical input sizes, $O(n \log n)$ is often acceptable even when $O(n)$ is achievable with more complex algorithms.

| $n$ | $\log_2 n$ | $n \log_2 n$ |
| --- | --- | --- |
| $10^3$ | $10$ | $10^4$ |
| $10^6$ | $20$ | $2 \times 10^7$ |
| $10^9$ | $30$ | $3 \times 10^{10}$ |
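
The table can be regenerated in a few lines (purely illustrative):

```python
import math

# log2(n) stays tiny even as n grows by orders of magnitude
for n in [10**3, 10**6, 10**9]:
    print(f"n = {n:>10}, log2(n) = {math.log2(n):4.1f}, n*log2(n) = {n * math.log2(n):.1e}")
```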

Problem Set

Problem 1. Prove that $3n^2 + 7n + 4 = O(n^2)$ using the formal definition.

Answer

We need to find $c > 0$ and $n_0$ such that $3n^2 + 7n + 4 \leq c \cdot n^2$ for all $n \geq n_0$.

For $n \geq 1$: $3n^2 + 7n + 4 \leq 3n^2 + 7n^2 + 4n^2 = 14n^2$.

Choose $c = 14$ and $n_0 = 1$. ✓

(More tightly: $c = 4$, $n_0 = 8$ also works.)

Problem 2. Prove that $n^2 \neq O(n)$.

Answer

Proof by contradiction. Assume $n^2 = O(n)$. Then $\exists\, c > 0,\ n_0$ such that $n^2 \leq cn$ for all $n \geq n_0$.

This implies $n \leq c$ for all $n \geq n_0$. But $n$ grows without bound, so this is impossible for any fixed $c$. Contradiction. $\square$

Equivalently: $\lim_{n \to \infty} n^2/n = \lim_{n \to \infty} n = \infty$, so the ratio is unbounded and $n^2 \neq O(n)$.

Problem 3. Use the Master Theorem to solve $T(n) = 4T(n/2) + n^2 \log n$.

Answer

$a = 4$, $b = 2$, $c = \log_2 4 = 2$.

$f(n) = n^2 \log n = n^c \log^1 n$.

This is Case 2 with $k = 1$: $T(n) = \Theta(n^c \log^{k+1} n) = \Theta(n^2 \log^2 n)$.

Problem 4. Determine the time complexity of the following function:

```python
def mystery(n):
    count = 0
    i = 1
    while i < n:
        j = 1
        while j < n:
            count += 1
            j *= 2
        i *= 3
    return count
```

Answer

Outer loop: $i$ takes values $1, 3, 9, 27, \ldots, 3^k < n$. Number of iterations: $\lceil \log_3 n \rceil$.

Inner loop: $j$ takes values $1, 2, 4, 8, \ldots, 2^m < n$. Number of iterations: $\lceil \log_2 n \rceil$.

Total: $O(\log_3 n \cdot \log_2 n) = O(\log^2 n)$.

Problem 5. Show that $\log(n!) = \Theta(n \log n)$.

Answer

Upper bound: $\log(n!) = \sum_{i=1}^{n} \log i \leq \sum_{i=1}^{n} \log n = n \log n$.

Lower bound: $\log(n!) = \sum_{i=1}^{n} \log i \geq \sum_{i=\lceil n/2 \rceil}^{n} \log i \geq \frac{n}{2} \log(n/2) = \frac{n}{2}(\log n - 1) = \Omega(n \log n)$.

Therefore: $\log(n!) = \Theta(n \log n)$. $\square$

Problem 6. A student claims that an algorithm with time complexity $O(n^3)$ is always slower than one with complexity $O(n^2 \log n)$. Is this correct? Explain.

Answer

No. Big-O is an asymptotic upper bound: it describes behaviour as $n \to \infty$. For small $n$, the $O(n^3)$ algorithm might be faster due to smaller constant factors.

Example: Algorithm A takes $1000 n^2 \log n$ operations, Algorithm B takes $n^3$ operations. For $n = 10$: A $\approx 1000 \times 100 \times 3.3 = 330{,}000$; B $= 1000$. B is much faster.

The crossover point is where $n^3 = 1000 n^2 \log n$, i.e. $n = 1000 \log_2 n$, which occurs at roughly $n \approx 13{,}700$. Below this, B is faster.

Problem 7. Perform amortised analysis of a stack that supports push ($O(1)$), pop ($O(1)$), and multipop(k), which pops $k$ elements ($O(k)$, or more precisely $O(\min(k, \mathrm{size}))$).

Answer

Aggregate method. In a sequence of $n$ operations, each push adds one element, and each pop or multipop only removes elements that were previously pushed. Total elements removed $\leq$ total elements pushed $\leq n$, so the cost of all pop and multipop operations is $O(n)$. Total push cost: $O(n)$. Total: $O(n)$. Amortised per operation: $O(1)$.

Accounting method. Charge \$2 per push: \$1 pays for the push, \$1 is stored as credit with the element. Pop and multipop consume the stored credit (\$1 per element removed), so their amortised cost is \$0 (they are "free"). Total amortised per operation: $O(1)$.

Potential method. Let $\Phi(S) = |S|$ (the number of elements on the stack).

- Push: actual $= 1$, $\Phi$ increases by $1$. Amortised $= 1 + 1 = 2$.
- Pop: actual $= 1$, $\Phi$ decreases by $1$. Amortised $= 1 - 1 = 0$.
- Multipop(k): actual $= k'$ where $k' = \min(k, |S|)$; $\Phi$ decreases by $k'$. Amortised $= k' - k' = 0$.

All amortised costs: $O(1)$. $\square$
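
A quick simulation supports the aggregate bound. In the sketch below (the operation mix and unit costs are my own choices for illustration), the average actual cost per operation stays bounded by a small constant however the multipops fall:

```python
import random

def simulate_stack_ops(n):
    # Apply n random stack operations, charging 1 unit per element pushed or popped.
    stack, total_cost = [], 0
    for _ in range(n):
        op = random.choice(["push", "pop", "multipop"])
        if op == "push":
            stack.append(0)
            total_cost += 1
        elif op == "pop" and stack:
            stack.pop()
            total_cost += 1
        elif op == "multipop" and stack:
            k = min(random.randint(1, 10), len(stack))
            del stack[-k:]     # pop k elements at cost k
            total_cost += k
    return total_cost

n = 100_000
print(f"average cost per operation: {simulate_stack_ops(n) / n:.3f}")
```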

Problem 8. Determine the time complexity of the following recursive function:

```python
def f(n):
    if n <= 1:
        return 1
    return f(n - 1) + f(n - 1)
```

Answer

$T(n) = 2T(n-1) + O(1)$, $T(1) = O(1)$.

This is not in Master Theorem form (which requires $T(n/b)$, not $T(n-1)$).

Expanding: $T(n) = 2(2T(n-2) + 1) + 1 = 4T(n-2) + 3 = 8T(n-3) + 7 = \cdots = 2^{n-1}T(1) + (2^{n-1} - 1)$.

$T(n) = \Theta(2^n)$.

Like naive Fibonacci, this tree recursion recomputes identical subproblems, leading to exponential time; memoisation collapses it to linear time, as sketched below.
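
A minimal memoised variant (using `functools.lru_cache`; the linear bound holds because each distinct argument is computed once):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f_memo(n):
    if n <= 1:
        return 1
    # The second call is a cache hit, so each distinct n is evaluated once: O(n).
    return f_memo(n - 1) + f_memo(n - 1)

print(f_memo(50))  # returns instantly; the naive version would make ~2^50 calls
```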

Problem 9. Rank the following functions in order of increasing growth rate: $n^{0.5}$, $\log^2 n$, $n \log n$, $2^n$, $n!$, $n^3$, $2^{\log n}$.

Answer

First simplify: $2^{\log n} = n$ (assuming $\log$ is base 2).

Growth rates (slowest to fastest):

$$\log^2 n < n^{0.5} < n = 2^{\log n} < n \log n < n^3 < 2^n < n!$$

Verification of selected orderings:

- $\log^2 n = o(n^{0.5})$: $\lim \frac{\log^2 n}{\sqrt{n}} = 0$
- $n^{0.5} = o(n)$: $\lim \frac{\sqrt{n}}{n} = 0$
- $n = o(n \log n)$: $\lim \frac{n}{n \log n} = 0$
- $n^3 = o(2^n)$: $\lim \frac{n^3}{2^n} = 0$

Problem 10. Prove that $O(f) + O(g) = O(\max(f, g))$.

Answer

Proof. Let $h_1 \in O(f)$ and $h_2 \in O(g)$. Then $\exists\, c_1, c_2, n_0$ such that $h_1(n) \leq c_1 f(n)$ and $h_2(n) \leq c_2 g(n)$ for all $n \geq n_0$.

Then: $h_1(n) + h_2(n) \leq c_1 f(n) + c_2 g(n) \leq (c_1 + c_2) \cdot \max(f(n), g(n))$.

Let $c = c_1 + c_2$. Then $h_1 + h_2 \leq c \cdot \max(f, g)$, so $h_1 + h_2 \in O(\max(f, g))$. $\square$

Corollary. If $f = O(g)$, then $O(f + g) = O(g)$.

For revision on specific algorithm complexities, see Sorting Algorithms and Searching Algorithms.


Problems

Problem 1. Determine the Big-O time complexity of the following code fragment:

```python
for i in range(n):
    for j in range(i):
        print(i, j)
```

Hint

Count the total number of times the inner loop body executes. When i = 0, the inner loop runs 0 times. When i = 1, it runs 1 time. When i = 2, it runs 2 times. Sum this series.

Answer

The inner loop runs i times for each value of i from 0 to n−1.

Total iterations $= 0 + 1 + 2 + \cdots + (n-1) = \frac{n(n-1)}{2} = \frac{n^2 - n}{2}$

The dominant term is $\frac{n^2}{2}$, so the time complexity is $O(n^2)$.

The $\frac{1}{2}$ constant and the $-n$ term are dropped because Big-O ignores constant factors and lower-order terms.

Problem 2. Determine the Big-O time complexity of the following code fragment:

```python
i = 1
while i < n:
    for j in range(n):
        print(j)
    i *= 2
```

Hint

The outer loop variable i doubles each iteration (1, 2, 4, 8, ...). How many times does the outer loop run? How many times does the inner loop run per outer iteration?

Answer

Outer loop: i takes values $1, 2, 4, 8, \ldots, 2^k < n$. The number of iterations is $\lceil \log_2 n \rceil$.

Inner loop: Runs n times per outer iteration.

Total: $n \times \log_2 n = O(n \log n)$.

Problem 3. Algorithm A has time complexity $O(n \log n)$ with a constant factor of 10, and Algorithm B has time complexity $O(n^2)$ with a constant factor of 1. For approximately what values of $n$ is Algorithm A faster than Algorithm B?

Hint

Algorithm A performs approximately $10 n \log_2 n$ operations and Algorithm B performs approximately $n^2$ operations. Solve $10 n \log_2 n < n^2$, which simplifies to $10 \log_2 n < n$.

Answer

We need $10 n \log_2 n < n^2$, which simplifies to $10 \log_2 n < n$ (for $n > 0$).

Testing values:

| $n$ | $10 \log_2 n$ | $n$ | A faster? |
| --- | --- | --- | --- |
| 10 | 33.2 | 10 | No |
| 20 | 43.2 | 20 | No |
| 30 | 49.1 | 30 | No |
| 40 | 53.2 | 40 | No |
| 50 | 56.4 | 50 | No |
| 60 | 59.1 | 60 | Yes |

Algorithm A becomes faster at approximately $n = 60$.

This demonstrates that Big-O notation only describes asymptotic behaviour. For small inputs, the algorithm with the better Big-O complexity can be slower due to larger constant factors.

Problem 4. Compare the time complexities of the following pairs of algorithms and state which is more efficient asymptotically. Justify each answer: (a) $O(n^2)$ vs $O(n \log n)$, (b) $O(2^n)$ vs $O(n^3)$, (c) $O(n)$ vs $O(\log n)$.

Hint

Use the limit test: if $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$, then $f(n) = o(g(n))$, meaning $f$ grows strictly slower than $g$.

Answer

(a) $O(n \log n)$ is more efficient than $O(n^2)$. $\lim_{n \to \infty} \frac{n \log n}{n^2} = \lim_{n \to \infty} \frac{\log n}{n} = 0$. So $n \log n = o(n^2)$: it grows strictly slower.

(b) $O(n^3)$ is more efficient than $O(2^n)$. $\lim_{n \to \infty} \frac{n^3}{2^n} = 0$ (exponential always dominates polynomial). So $n^3 = o(2^n)$.

(c) $O(\log n)$ is more efficient than $O(n)$. $\lim_{n \to \infty} \frac{\log n}{n} = 0$ (by L'Hôpital's rule or the known hierarchy). So $\log n = o(n)$.

Summary (most to least efficient): $O(\log n) \subset O(n) \subset O(n \log n) \subset O(n^3) \subset O(2^n)$.

Problem 5. Analyze the space complexity of the following function:

```python
def create_matrix(n):
    matrix = []
    for i in range(n):
        row = [0] * (2 * n)
        matrix.append(row)
    return matrix
```

Hint

Count the total number of elements stored. The matrix has n rows, each containing 2 * n elements.

Answer

The function creates a matrix with:

- n rows
- Each row contains $2n$ elements (integers)
- Total elements: $n \times 2n = 2n^2$

Each integer occupies $O(1)$ space, and the list overhead is also $O(1)$ per element.

Space complexity: $O(n^2)$.

The dominant factor is the total number of stored values ($2n^2$). The variable i, the row reference, and the loop overhead all use $O(1)$ additional space and are negligible compared to the matrix itself.

Problem 6. Analyze the space complexity of the following recursive function:

```python
def sum_recursive(arr, n):
    if n <= 0:
        return 0
    return arr[n-1] + sum_recursive(arr, n-1)
```

Hint

Recursive functions use stack space proportional to their recursion depth. How deep does the recursion go, and how much space does each stack frame use?

Answer

Recursion depth: The function calls itself with n-1 until n <= 0. This means the maximum depth is n (from sum_recursive(arr, n) down to sum_recursive(arr, 0)).

Stack frame size: Each frame stores the parameters arr (a reference, $O(1)$) and n (an integer, $O(1)$), plus the return address and local variables, all $O(1)$ per frame.

Total space: $n \times O(1) = O(n)$.

Space complexity: $O(n)$ due to the call stack. The input array arr itself is not counted in the auxiliary space analysis since it is the input, not allocated by the function.

Note: An iterative version of this function, sketched below, would use only $O(1)$ auxiliary space.
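
A possible iterative version (illustrative):

```python
def sum_iterative(arr, n):
    # Same result as sum_recursive, but O(1) auxiliary space: no call stack growth.
    total = 0
    for i in range(n):
        total += arr[i]
    return total
```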

Problem 7. State the best-case, average-case, and worst-case time complexity of quicksort. For each case, describe the type of input that produces that complexity and explain why it occurs.

Hint

Quicksort's performance depends on how the pivot partitions the array. Consider what happens when the pivot is the median element, a random element, and the minimum or maximum element.

Answer

| Case | Complexity | Input condition | Explanation |
| --- | --- | --- | --- |
| Best | $O(n \log n)$ | Pivot always divides array in half | Each level processes all $n$ elements, and there are $\log n$ levels. Total: $O(n \log n)$. |
| Average | $O(n \log n)$ | Random inputs | With a random pivot, the expected split ratio is balanced. The recurrence $T(n) = T(n/2) + T(n/2) + O(n)$ holds on average. |
| Worst | $O(n^2)$ | Already sorted (with first/last pivot) | If the pivot is always the minimum or maximum, one partition has $n-1$ elements and the other has $0$. Recurrence: $T(n) = T(n-1) + O(n) = O(n^2)$. |

Mitigation: Randomised quicksort (choosing a random pivot) or median-of-three pivot selection makes the worst case extremely unlikely in practice, giving expected $O(n \log n)$ regardless of input.

Problem 8. Explain why binary search has $O(\log n)$ time complexity. In your answer, show how the search range halves at each step and derive the maximum number of iterations mathematically.

Hint

Start with a range of size $n$. After each comparison, the range is reduced to at most half. After $k$ comparisons, the range size is at most $n/2^k$. When does this become less than 1?

Answer

Intuitive argument: Binary search halves the search range at each step:

- After 1 comparison: range $\leq n/2$
- After 2 comparisons: range $\leq n/4$
- After 3 comparisons: range $\leq n/8$
- After $k$ comparisons: range $\leq n/2^k$

The algorithm terminates once the range size drops below 1:

$$\frac{n}{2^k} < 1 \implies 2^k > n \implies k > \log_2 n$$

So the maximum number of iterations is $\lfloor \log_2 n \rfloor + 1 = O(\log n)$.

Formal derivation using the Master Theorem:

$$T(n) = T(n/2) + O(1), \quad T(1) = O(1)$$

Here $a = 1$, $b = 2$, $c = \log_2 1 = 0$. Since $f(n) = \Theta(1) = \Theta(n^0) = \Theta(n^c)$, this is Master Theorem Case 2 with $k = 0$:

$$T(n) = \Theta(n^c \log^{k+1} n) = \Theta(\log n)$$

Why log n is efficient: $\log_2 n$ grows extremely slowly. For $n = 10^9$ (one billion), $\log_2 n \approx 30$, so binary search finds any element in a sorted billion-element array with at most about 30 comparisons.
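
To see the bound concretely, here is an instrumented iterative binary search (a sketch; the counter and function name are my own):

```python
def binary_search_count(arr, target):
    # Standard iterative binary search, returning (index or None, comparisons made).
    lo, hi, comparisons = 0, len(arr) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if arr[mid] == target:
            return mid, comparisons
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, comparisons

arr = list(range(1_000_000))
_, comps = binary_search_count(arr, 999_999)
print(comps)  # about log2(10^6) ~ 20 comparisons
```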

Problem 9. Determine the Big-O time complexity of the following recursive function:

```python
def recursive_func(n):
    if n <= 0:
        return 0
    x = 0
    for i in range(n):
        x += i
    return x + recursive_func(n // 2)
```

Hint

Set up the recurrence relation. The function does $O(n)$ work (the for loop) and then calls itself with $n/2$. This gives $T(n) = T(n/2) + O(n)$. Apply the Master Theorem.

Answer

Recurrence relation:

$$T(n) = T(n/2) + O(n), \quad T(0) = O(1)$$

The $O(n)$ term comes from the for loop that iterates n times. The recursive call passes n // 2.

Applying the Master Theorem: $a = 1$, $b = 2$, $c = \log_2 1 = 0$.

$f(n) = \Theta(n) = \Omega(n^{c + \epsilon})$ with $\epsilon = 1$.

Check the regularity condition: $a \cdot f(n/b) = 1 \cdot (n/2) = n/2 \leq \delta \cdot n$ for $\delta = 1/2 < 1$. ✓

This is Case 3 of the Master Theorem: $T(n) = \Theta(f(n)) = \Theta(n)$.

Verification by expansion:

$$T(n) = n + T(n/2) = n + n/2 + T(n/4) = n + n/2 + n/4 + \cdots + 1$$

$$\leq n\left(1 + \frac{1}{2} + \frac{1}{4} + \cdots\right) = 2n = O(n)$$

Time complexity: $O(n)$.

Problem 10. (Exam-style) Two algorithms solve the same problem: counting inversions in an array (the number of pairs $(i, j)$ where $i < j$ but $A[i] > A[j]$).

Algorithm P uses a brute-force approach:

```python
def algorithm_p(arr):
    result = 0
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] > arr[j]:
                result += 1
    return result
```

Algorithm Q uses a modified merge sort that counts inversions during the merge step.

(a) Determine the time complexity of each algorithm. (b) For $n = 10\,000$, estimate the number of operations for each algorithm. (c) Which algorithm is more efficient and by what factor? Explain your reasoning.

Hint

For Algorithm P, count the number of iterations of the nested loops. For Algorithm Q, recall that merge sort is $O(n \log n)$ and the modification to count inversions doesn't change the asymptotic complexity.

Answer

(a) Time complexity:

Algorithm P: The outer loop runs $n$ times (i = 0 to n−1). For each i, the inner loop runs from i+1 to n−1, which is $n - 1 - i$ iterations.

Total iterations: $\sum_{i=0}^{n-1}(n - 1 - i) = (n-1) + (n-2) + \cdots + 1 + 0 = \frac{n(n-1)}{2}$

Time complexity of P: $O(n^2)$.

Algorithm Q: Modified merge sort follows the same divide-and-conquer structure as standard merge sort. Each level processes all $n$ elements during merging, and there are $\log n$ levels. The inversion count is accumulated during the merge step without changing the merge complexity. (A sketch of this merge step follows the answer.)

Time complexity of Q: $O(n \log n)$.

(b) Estimated operations for $n = 10\,000$:

| Algorithm | Formula | Operations |
| --- | --- | --- |
| P | $n(n-1)/2$ | $10\,000 \times 9\,999 / 2 \approx 49\,995\,000$ |
| Q | $n \log_2 n$ | $10\,000 \times 13.3 \approx 133\,000$ |

(c) Comparison:

Algorithm Q is significantly more efficient. The ratio is:

$$\frac{n^2/2}{n \log n} = \frac{n}{2 \log_2 n} = \frac{10\,000}{2 \times 13.3} \approx 376$$

Algorithm Q is approximately 376 times faster than Algorithm P for $n = 10\,000$.

Reasoning: The difference grows with $n$. For $n = 1\,000\,000$, Algorithm P would perform $\approx 5 \times 10^{11}$ operations while Algorithm Q performs $\approx 20\,000\,000$: a factor of about 25,000×. This demonstrates the critical importance of choosing algorithms with better asymptotic complexity for large inputs.
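
The problem statement does not show Algorithm Q's code; one standard way the merge-step counting might look (a sketch, not exam-board reference code) is:

```python
def count_inversions(arr):
    # Modified merge sort: counts inversions while merging, O(n log n) overall.
    def sort(a):
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, inv_left = sort(a[:mid])
        right, inv_right = sort(a[mid:])
        merged, inv = [], inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining left element,
                # so it forms len(left) - i inversions at once.
                inv += len(left) - i
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort(list(arr))[1]

print(count_inversions([2, 4, 1, 3, 5]))  # 3 inversions: (2,1), (4,1), (4,3)
```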
