Number Systems
1. Positional Number Systems
Definition
A positional number system represents a number as a sequence of digits in base , where the value of the number is:
Each digit satisfies . The most significant digit (MSD) is and the least significant digit (LSD) is .
The bases relevant to A Level Computer Science are:
| Base | Name | Digits Used |
|---|---|---|
| 10 | Denary | 0, 1, 2, ..., 9 |
| 2 | Binary | 0, 1 |
| 8 | Octal | 0, 1, 2, ..., 7 |
| 16 | Hex | 0–9, A, B, C, D, E, F |
We use subscript notation to denote the base: .
Conversion Between Bases
Base to Denary
Algorithm. Given digits in base , compute:
Correctness. This is exactly the definition of positional notation, so correctness is immediate.
Proof of termination. The sum has exactly terms, so the algorithm terminates after iterations.
Example: Convert to denary
Denary to Base (Repeated Division)
Algorithm. To convert denary to base :
- Set
- While : digit ; set ;
- The result is (digits collected in reverse order)
Correctness proof. We prove by induction on the number of division steps that the algorithm produces the correct base- representation.
Base case. After the first step, and . We have , so is indeed the coefficient of .
Inductive step. Assume after steps we have where . The next step computes and , giving . Substituting:
This maintains the invariant. When , the representation is complete.
Example: Convert to binary
| Step | |||
|---|---|---|---|
| 1 | 156 | 0 | 78 |
| 2 | 78 | 0 | 39 |
| 3 | 39 | 1 | 19 |
| 4 | 19 | 1 | 9 |
| 5 | 9 | 1 | 4 |
| 6 | 4 | 0 | 2 |
| 7 | 2 | 0 | 1 |
| 8 | 1 | 1 | 0 |
Reading bottom to top:
Binary to Hexadecimal (and vice versa)
Since , each hex digit corresponds to exactly 4 binary digits. Group binary digits from right to left in groups of 4, then convert each group.
Example: Convert to hex
Group:
Pad left group:
Convert:
Hexadecimal to Binary
Replace each hex digit with its 4-bit binary equivalent.
Example: Convert to binary
, ,
Result:
Octal Conversions
Since , each octal digit maps to exactly 3 binary digits. Convert by grouping in 3s (or multiplying/dividing by 8).
2. Binary Arithmetic
Binary Addition
We add bitwise from right to left, with carries:
| Sum | ||||
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 | 0 |
| 0 | 1 | 0 | 1 | 0 |
| 0 | 1 | 1 | 0 | 1 |
| 1 | 0 | 0 | 1 | 0 |
| 1 | 0 | 1 | 0 | 1 |
| 1 | 1 | 0 | 0 | 1 |
| 1 | 1 | 1 | 1 | 1 |
Proposition (Carry propagation). The carry into bit position depends only on bits through .
Proof. The carry is defined recursively: and . By structural induction on , is a Boolean function of only.
Example: Add
Carry: 1 1 1 0 0
1 0 1 1
+ 0 1 1 0
---------
1 0 0 0 1
, , . Correct. ✓
Binary Subtraction
Binary subtraction can be performed using two's complement (see below).
3. Two's Complement Representation
Motivation
We need a way to represent both positive and negative integers in binary, using a fixed number of bits.
Definition
For an -bit two's complement representation, the range of representable integers is:
The representation of a non-negative integer is simply its standard -bit binary representation. The representation of a negative integer (where ) is:
Derivation: Why Works
Theorem. For any -bit positive integer (), the two's complement representation of equals (where is the bitwise NOT of the -bit representation of , and is binary addition).
Proof. The -bit representation of has bits . The bitwise complement has bits , where . The value of as an unsigned -bit number is:
Therefore:
This is exactly the definition of the two's complement of .
Corollary. Adding and in two's complement yields zero (modulo ).
Proof.
In bits, is represented as with a carry out of bit position , which is discarded. Hence the result is .
Two's Complement Addition and Overflow
When adding two -bit two's complement numbers, the result is correct (modulo ) if and only if no overflow occurs.
Overflow detection rules:
| Condition | Overflow? |
|---|---|
| Positive + Positive = Negative | Yes |
| Negative + Negative = Positive | Yes |
| Positive + Negative | Never |
| Same signs produce same sign result | No |
Formally: overflow occurs if and only if the carry into the MSB differs from the carry out of the MSB.
Example: Add and in 4-bit two's complement
(since )
Carry: 1 1 1 1 0
1 0 1 1
+ 0 0 1 1
---------
1 1 1 0
in two's complement: . Let us verify: . ✓
No overflow (negative + positive never overflows).
Example: Overflow — Add and in 4-bit two's complement
,
Carry: 0 1 1 0 0
0 1 1 0
+ 0 1 0 1
---------
1 0 1 1
Result: in two's complement. But , which is outside the range for 4 bits. Overflow detected: positive + positive yielded negative. ✓
info
- AQA: Requires two's complement for 8-bit and 16-bit numbers.
- CIE: Requires two's complement for 8-bit numbers specifically.
- OCR: Requires understanding of sign and magnitude as well as two's complement.
4. Fixed-Point Binary Representation
Definition
A fixed-point binary number uses a specified number of bits for the integer part and a specified number of bits for the fractional part.
For an -bit number with integer bits and fractional bits ():
where is the LSB of the integer part and is the MSB.
Range and Precision
For an unsigned fixed-point number with integer bits and fractional bits:
- Range:
- Precision (resolution):
For signed (two's complement) with integer bits and fractional bits:
- Range:
Example: 8-bit fixed-point with 5 integer bits and 3 fractional bits
:
Integer part: Fractional part:
Value:
Range: , Precision:
5. Binary Coded Decimal (BCD)
Definition
In Binary Coded Decimal (BCD), each decimal digit (0–9) is represented by its 4-bit binary equivalent.
| Decimal | BCD |
|---|---|
| 0 | 0000 |
| 1 | 0001 |
| 2 | 0010 |
| ... | ... |
| 9 | 1001 |
The codes through are invalid in BCD.
Properties
- Each decimal digit requires exactly 4 bits (a nibble)
- A -digit decimal number requires bits in BCD
- BCD is less space-efficient than pure binary: e.g., requires 12 bits in BCD but only 10 bits in pure binary ()
- BCD avoids rounding errors in decimal arithmetic — useful in financial systems
Pitfall BCD is NOT the same as converting the entire number to binary. in BCD is , NOT .
6. Character Encoding
ASCII
The American Standard Code for Information Interchange (ASCII) is a 7-bit character encoding standard representing 128 characters:
- –: Control characters (NUL, BEL, LF, CR, etc.)
- : Space
- –: Digits '0'–'9'
- –: Uppercase letters 'A'–'Z'
- –: Lowercase letters 'a'–'z'
Key property: The codes for uppercase and lowercase letters differ by exactly (), so bit 5 distinguishes them. Specifically, 'A' = and 'a' = .
Extended ASCII uses 8 bits (256 characters) but is not standardised — various extensions exist.
Unicode
Unicode is a universal character encoding standard that aims to represent every character from every writing system. Key facts:
- Unicode assigns a unique code point to each character, written as
U+XXXX(e.g., 'A' =U+0041, '€' =U+20AC) - As of Unicode 15.0, there are over 149,000 characters across 161 scripts
- Unicode is an abstract standard — it defines code points, not how they are stored in bytes
UTF-8 Encoding
UTF-8 is a variable-length encoding of Unicode code points into bytes. It is backward-compatible with ASCII.
Encoding rules:
| Code Point Range | Binary Pattern | Bytes |
|---|---|---|
U+0000 – U+007F | 0xxxxxxx | 1 |
U+0080 – U+07FF | 110xxxxx 10xxxxxx | 2 |
U+0800 – U+FFFF | 1110xxxx 10xxxxxx 10xxxxxx | 3 |
U+10000 – U+10FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 4 |
Properties of UTF-8:
- Self-synchronising: Any byte can identify whether it is a single-byte character, the start of
a multi-byte sequence, or a continuation byte. Continuation bytes always start with
10, start bytes start with11...0. - ASCII compatible: All ASCII text is valid UTF-8.
- Prefix-free: No valid UTF-8 sequence is a prefix of another valid sequence.
Example: Encode '€' (U+20AC) in UTF-8
This falls in the range U+0800–U+FFFF, so it uses 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx.
Fill in the bits from the code point:
Split:
Result: 11100010 10000010 10101100 = E2 82 AC in hex.
7. Representing Negative Numbers: Other Methods
Sign and Magnitude
The MSB represents the sign (0 = positive, 1 = negative), and the remaining bits represent the magnitude.
- Range for bits:
- Two representations of zero: and
Pitfall Sign and magnitude is rarely used in practice because:
- It has two representations of zero
- Addition requires different logic depending on the signs
- The range is asymmetric
Problem Set
Problem 1. Convert to denary.
Hint
Use the positional formula with . Remember .
Answer
Problem 2. Convert to binary and hexadecimal.
Hint
Use repeated division by 2, then group into 4-bit nibbles for hex.
Answer
Binary:
Hex: Group as
Problem 3. Represent in 8-bit two's complement. Verify by adding it to and showing the result is zero.
Hint
Find the binary of 42, flip all bits, add 1.
Answer
Bitwise NOT:
Add 1:
Verification: . The carry out of 8 bits is discarded, leaving . ✓
Problem 4. Perform the addition and interpret the result in 8-bit unsigned and 8-bit two's complement.
Hint
Do the binary addition first. Then interpret based on the encoding scheme.
Answer
1 0 1 1 0 1 1 0
+ 0 1 1 0 1 0 1 1
-----------------
1 0 0 1 0 0 0 0 1
8-bit result (discard carry):
- Unsigned: . Check: . But . ✓
- Two's complement: . Check: . ✓
Problem 5. A fixed-point system uses 12 bits: 8 for the integer part and 4 for the fractional part. What is the range and precision? Convert to denary.
Hint
Precision is . Range depends on whether signed or unsigned.
Answer
Unsigned range: , precision: ().
Value:
Problem 6. Encode the string "Hi!" in ASCII (hex). Then explain how many bytes this would take in UTF-8.
Hint
Look up each character's ASCII code point.
Answer
'H' = 72 = = 01001000 'i' = 105 = = 01101001 '!' = 33 = =
00100001
In ASCII hex: 48 69 21
Since all code points are below U+007F, UTF-8 uses 1 byte per character, so 3 bytes total. UTF-8
encoding is identical to ASCII for these characters.
Problem 7. Encode the character 'ñ' (U+00F1) in UTF-8. Show each step.
Hint
U+00F1 falls in the 2-byte range U+0080–U+07FF. Use the pattern 110xxxxx 10xxxxxx.
Answer
Split into bits:
First byte: 110 + 00011 = 11000011 = Second byte: 10 + 110001 = 10110001 =
UTF-8: C3 B1
Problem 8. A student claims that in 4-bit two's complement represents . Another claims it represents . Who is correct? Explain.
Hint
What does the MSB tell you in two's complement?
Answer
The first student is correct for two's complement interpretation. The MSB is 1, so it is negative.
Value: . ✓
The second student is interpreting it as an unsigned number: . This is also valid — the bit pattern is the same, but the interpretation differs. Context determines which encoding is used.
Problem 9. Prove that the two's complement of the two's complement of equals (for in the valid range, excluding ).
Hint
Start with . Apply the operation again.
Answer
Let .
. ✓
The exception is , whose two's complement is , which exceeds bits. In -bit arithmetic, , which is the bit pattern — the same as . So is its own two's complement.
Problem 10. In a system using 6-bit two's complement, what is the result of adding and ? Does overflow occur?
Hint
Range of 6-bit two's complement is . Compute . Is this in range?
Answer
1 1 0 0 0 0
+ 1 1 0 0 0 1
-----------
1 1 0 0 0 0 1
6-bit result: . This is .
Check: . No overflow: is in range . ✓
Both inputs are negative and the result is negative, so no overflow.
:::
:::
:::