Base32 Encoding - The Basics

Cloud Tuned

Base32 Encoding

Base32 encoding is a method for converting binary data into text using a set of 32 characters. It is used where text-based formats are required, such as URLs, file names, or data exchange formats. Base32 is less space-efficient than Base64: every 5 input bytes become 8 output characters (a 60% size increase, compared with about 33% for Base64). In exchange, its alphabet is case-insensitive and leaves out easily confused characters.

Key Aspects of Base32 Encoding

1. Character Set:

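- The standard Base32 alphabet (RFC 4648) uses 32 characters: A-Z and 2-7. The digits 0 and 1 are left out because they are easily confused with the letters O and I (or L).

- The alphabet uses a single letter case, so encoded text survives case-insensitive handling and decoders can choose to accept lowercase input.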

2. How It Works:

- Base32 encoding divides the binary data into groups of 5 bits and maps each group to one of the 32 characters.

- Each Base32 character represents 5 bits of data.

- Padding with = characters is used to ensure the final output is a multiple of 8 characters.

3. Padding:

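- If the input length is not a multiple of 5 bytes, the output is padded with = characters to make it a multiple of 8 characters. A final partial block of 1, 2, 3, or 4 bytes gets 6, 4, 3, or 1 padding characters respectively; other padding counts are not valid.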

- This padding ensures proper decoding back to the original binary data.
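The padding arithmetic can be checked with a few lines of Python. This is a minimal sketch: the helper name base32_lengths is hypothetical and introduced only for illustration.

```python
def base32_lengths(n_bytes: int) -> tuple[int, int]:
    """Return (data_chars, pad_chars) for an input of n_bytes bytes."""
    # Each character carries 5 bits, so round the bit count up to whole characters.
    data_chars = (n_bytes * 8 + 4) // 5
    # Fill the output up to the next multiple of 8 characters with '='.
    pad_chars = -data_chars % 8
    return data_chars, pad_chars

for n in range(1, 6):
    print(n, *base32_lengths(n))
# 1 2 6
# 2 4 4
# 3 5 3
# 4 7 1
# 5 8 0
```

Only 6, 4, 3, 1, or 0 padding characters ever appear, one case for each possible size of the final block.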

Base32 Encoding Process

1. Convert Input Data to Binary:

- For example, the ASCII string "Hi" is represented in binary as:

H: 01001000 i: 01101001

2. Group the Binary Data into 5-bit Chunks:

- Combine the 8-bit bytes into one sequence and then divide it into 5-bit groups:

01001 00001 10100 1

- The final group here holds only 1 bit, so it is padded with four zero bits to make a full 5-bit group: 10000.

3. Map the 5-bit Groups to Base32 Characters:

- Using the Base32 alphabet, map each 5-bit group to the corresponding Base32 character:

01001 (9): J
00001 (1): B
10100 (20): U
10000 (16): Q (padded)

- The four characters J, B, U, Q are followed by four = characters to reach 8 characters, so the encoded string for "Hi" is "JBUQ====".
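The three steps above can be sketched directly in Python. This is an illustrative bit-string implementation rather than how a production library does it; the names ALPHABET and b32encode_sketch are hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def b32encode_sketch(data: bytes) -> str:
    # Step 1: write every byte as 8 bits and join them into one sequence.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Step 2: pad the last group with zero bits so the length is a multiple of 5.
    bits += "0" * (-len(bits) % 5)
    # Step 3: map each 5-bit group to its character in the alphabet.
    encoded = "".join(ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))
    # Finally, pad with '=' to a multiple of 8 characters.
    return encoded + "=" * (-len(encoded) % 8)

print(b32encode_sketch(b"Hi"))  # JBUQ====
```

Building a string of "0" and "1" characters is slow for large inputs, but it keeps each step visible and matches the hand-worked mapping J, B, U, Q.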

Example

Let's encode the string "Hello" using Base32:

1. Convert to Binary:

H: 01001000 e: 01100101 l: 01101100 l: 01101100 o: 01101111

2. Group into 5-bit Chunks:

01001 00001 10010 10110 11000 11011 00011 01111

3. Pad the Data:

- "Hello" is 5 bytes (40 bits), which splits into exactly eight 5-bit groups, so no zero bits and no = characters are needed.

4. Map to Base32 Characters:

01001 (9): J
00001 (1): B
10010 (18): S
10110 (22): W
11000 (24): Y
11011 (27): 3
00011 (3): D
01111 (15): P

- The encoded string for "Hello" is "JBSWY3DP".

Base32 Alphabet Table

+-------+------+-------+------+-------+------+-------+------+
| Value | Char | Value | Char | Value | Char | Value | Char |
+-------+------+-------+------+-------+------+-------+------+
| 0     | A    | 8     | I    | 16    | Q    | 24    | Y    |
| 1     | B    | 9     | J    | 17    | R    | 25    | Z    |
| 2     | C    | 10    | K    | 18    | S    | 26    | 2    |
| 3     | D    | 11    | L    | 19    | T    | 27    | 3    |
| 4     | E    | 12    | M    | 20    | U    | 28    | 4    |
| 5     | F    | 13    | N    | 21    | V    | 29    | 5    |
| 6     | G    | 14    | O    | 22    | W    | 30    | 6    |
| 7     | H    | 15    | P    | 23    | X    | 31    | 7    |
+-------+------+-------+------+-------+------+-------+------+
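The table does not need to be typed out by hand in code. As a small sketch (the names ALPHABET and VALUE_OF are illustrative), it can be built from the standard library's uppercase letters plus the digits 2-7, with a reverse lookup for decoding:

```python
import string

# Values 0-25 map to A-Z, values 26-31 map to the digits 2-7.
ALPHABET = string.ascii_uppercase + "234567"

# Reverse lookup: character -> 5-bit value.
VALUE_OF = {char: value for value, char in enumerate(ALPHABET)}

print(len(ALPHABET))   # 32
print(ALPHABET[18])    # S
print(VALUE_OF["7"])   # 31
```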

Applications of Base32 Encoding

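1. URL Safe Encoding: Base32 is often used for encoding data in URLs and file names because its alphabet contains only letters and digits, which have no special meaning in these contexts. The = padding is the one exception, so it is often omitted or escaped there.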

2. Data Encoding: Used in various applications such as encoding cryptographic keys, tokens, and identifiers where case-insensitivity and readability are important.

3. TOTP (Time-based One-Time Password): Base32 is commonly used to encode shared secrets in two-factor authentication systems like Google Authenticator.

Decoding Base32

To decode a Base32 encoded string, the process is reversed:

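1. Strip any trailing = padding, then replace each Base32 character with its 5-bit binary representation.

2. Concatenate the bits and regroup them into 8-bit bytes, discarding the leftover zero bits at the end (always fewer than 8).

3. The resulting bytes are the original binary data.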
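The reverse process can be sketched the same way. This is a minimal illustration with hypothetical names (ALPHABET, b32decode_sketch); it assumes well-formed input and does not check that the padding count is valid.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def b32decode_sketch(text: str) -> bytes:
    # Drop the '=' padding and accept lowercase input.
    stripped = text.rstrip("=").upper()
    # Step 1: replace each character with its 5-bit value
    # (str.index raises ValueError for characters outside the alphabet).
    bits = "".join(f"{ALPHABET.index(char):05b}" for char in stripped)
    # Step 2: keep only whole bytes; the trailing bits are encoder padding.
    usable = len(bits) - len(bits) % 8
    # Step 3: convert each 8-bit group back to a byte.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

print(b32decode_sketch("JBUQ===="))  # b'Hi'
print(b32decode_sketch("JBSWY3DP"))  # b'Hello'
```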

Example in Python

Here's a simple example of encoding and decoding using Python:

import base64

# Encode
original_data = b"Hello"
encoded_data = base64.b32encode(original_data)
print(encoded_data)  # Output: b'JBSWY3DP'

# Decode
decoded_data = base64.b32decode(encoded_data)
print(decoded_data)  # Output: b'Hello'
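By default, base64.b32decode rejects lowercase letters and input whose length is not a multiple of 8. The sketch below assumes the encoded text may arrive lowercased or with its = padding stripped (for example after passing through a URL); the helper name decode_lenient is hypothetical.

```python
import base64

def decode_lenient(text: str) -> bytes:
    # Restore any '=' padding that was stripped, up to a multiple of 8 characters.
    padded = text + "=" * (-len(text) % 8)
    # casefold=True tells b32decode to accept lowercase letters as well.
    return base64.b32decode(padded, casefold=True)

print(decode_lenient("jbswy3dp"))  # b'Hello'
print(decode_lenient("jbuq"))      # b'Hi'
```

Input with an impossible length (such as a single character) still raises binascii.Error, so genuinely malformed data is not silently accepted.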

In summary, Base32 encoding is a useful technique for converting binary data into a text format that is safe for URLs, filenames, and other text-based formats. It ensures that binary data can be transmitted and stored in environments that only support text, while also being case-insensitive and avoiding ambiguous characters.
