Number Representation
Table of Contents
1. Binary
Binary is a system of storing data using just two digits: 1 and 0. Everything in a computer is ultimately stored in binary (with high voltage being 1, and low voltage being 0), so we need a way to represent any piece of data using binary.
Binary data has no meaning in and of itself; we must assign meaning to each bitstring, and read each bitstring as if it was that particular meaning. For example, the bitstring 0b00111001 can represent 56 if we interpret it as an integer, or the character '9' if we interpret it as ASCII.
A system to data as binary is known as a representation scheme. As long as we use a consistent scheme to read and write this data, our program will work (even if outside users don’t know our scheme). Ideally, schemes should:
- represent data that’s relevant to whatever we’re making
- store things efficiently
- optimal memory usage
- efficient operations
- be intuitive for humans to understand (usually)
1.1. Decimal Conversions
An analogous system is decimal. Decimal numbers are stored using ten digits, zero through nine. Binary is the same, except we only get two digits: the number 0b1010 is equivalent to \(1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 0 \times 2^0\). Since every integer can be uniquely expressed as a sum of powers of 2, we can write any number in binary.
Since \(2^{10} = 1024\) is extremely close to 1000, we can approximate large powers of 2 in decimal. For example, we can write:
\begin{align} 2^{32} = 2^{30} \times 2^2 = (2^{10})^3 \times 4 \approx 4 \text{ billion} \notag \end{align}The actual number is \(4294967296\), which is approximately 7% off from our approximation.
There are also binary SI prefixes: kibi, mebi, gibi intsead of kilo, mega, giga. For example, 512 TiB is called 512 Tebibytes, which is equal to \(512 \times 2^{40}\) bytes.
2. Hexadecimal
Hexadecimal is a system of stroing numbers using 16 digits: 0123456789ABCDEF. Hexadecimal numbers are often prefixed with 0x to distinguish them from decimal numbers. In practice, hex numbers is often used as a proxy for binary.
Since base 16 is \(2^4\), translation between binary and hex is much easier. For example, the hex number 0x87 can be converted to binary by converting each digit separately to binary and then concatenating the results.
3. Representing Integers
The problem is, there’s infinitely many integers. Making a representation that could represent any integer would require infinitely many bits.
One solution is to treat an integer as an array of digits and extend the array as needed (e.g. Python’s large integer primitive). However, this would require variable amounts of time to compute operations.
Or, we can only allow for numbers up to a max value (fixed bitlength). This is often referred to as an n-bit unsigned integer, where we allow numbers from \(0\) to \(2^n - 1\) (e.g. an 8-bit unsigned int goes from \(0\) to \(255\)). To write an n-bit number, we always write out the full n bits, including leading zeroes.
4. Boolean Values
We can use binary to represent the boolean states: 1 to mean TRUE, and 0 to mean FALSE. We define the operators OR, AND, NOT, and XOR for boolean values:
| x | y | x OR y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |
| x | y | x AND y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
| x | y | x XOR y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
| x | NOT x |
|---|---|
| 0 | 1 |
| 1 | 0 |
5. Bitwise Operations
Bitwise operations are operators designed to work directly with bits. In C, these are:
| Operator | Operation |
|---|---|
& |
bitwise AND |
| |
bitwise OR |
^ |
bitwise XOR |
~ |
bitwise NOT |
<< |
left shift |
>> |
right shift |