Mastering String Encoding and Decoding Techniques


Real-Life Example: Sending a Secret List
Imagine you are working on a messaging app like WhatsApp or Discord. You want to send a list of messages (strings) like:
["hello", "world", "I", "am", "awesome"]
But there’s a problem: a network connection can’t carry a list directly. It only carries a flat sequence of characters, in other words a single string!
So, how can we turn this into a single string, send it, and rebuild the original list on the other side?
That’s exactly what the “Encode and Decode a String Array” problem is about.
Problem Statement
Design an algorithm to encode a list of strings to a single string.
Then, design the decoding function that reconstructs the list from that single string.
Example:
Input: ["leet", "Code", "is", "cool"]
Encoded: "4#leet4#Code2#is4#cool"
Decoded: ["leet", "Code", "is", "cool"]
Why This Problem Matters
This problem shows up in interviews at Meta (Facebook), Google, Amazon, and more because it tests:
Your understanding of string manipulation
Your ability to design custom formats
Your skill at thinking about edge cases and decoding
It’s also extremely practical for real-world systems that serialize and deserialize data like:
Sending data over APIs or sockets
Storing strings in databases
Building chat apps or text processing tools
Naive Approach (and Why It Fails)
You might think: “Why not just join the strings with a comma?”
','.join(["Hello", "World"]) -> "Hello,World"
But what if a string contains a comma?
["Hello", "wor,ld"] -> "Hello,wor,ld" ❌ Confusing
We can’t tell how to split it back!
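To make the ambiguity concrete, here is a minimal sketch of the comma-join idea. The helper names naive_encode and naive_decode are made up for this illustration and are not part of the final solution.

```python
def naive_encode(strs):
    # Join with a comma; this breaks as soon as a string contains a comma.
    return ",".join(strs)

def naive_decode(s):
    # Split on every comma, including commas that were part of the data.
    return s.split(",")

original = ["Hello", "wor,ld"]
round_trip = naive_decode(naive_encode(original))
print(round_trip)  # ['Hello', 'wor', 'ld']

# Two different lists collapse to the same encoded string.
print(naive_encode(["Hello", "wor,ld"]) == naive_encode(["Hello", "wor", "ld"]))  # True
```

The decoder has no way to know which of the two lists was sent, so the information is lost at encoding time.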
So… we need a robust, unambiguous way to encode and decode.
Smart Approach: Length-Prefix Encoding
We’ll encode each string as:
<length>#<string>
So ["Hello", "world"] becomes:
5#Hello5#world
We use # as a delimiter because it can’t appear in the length (which is just a number), so the first # we meet always marks the end of the length. During decoding, we read the length, skip the #, and then read exactly that many characters.
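As a rough illustration of why a # inside the data is harmless, the sketch below parses a single record by hand. The sample string and variable names are chosen just for this example.

```python
encoded = "3#a#b5#hello"  # encodes ["a#b", "hello"]

# The first '#' always terminates the length, because digits never contain '#'.
hash_pos = encoded.index("#")
length = int(encoded[:hash_pos])

# Read exactly `length` characters; any '#' inside them is treated as data.
start = hash_pos + 1
first = encoded[start:start + length]
rest = encoded[start + length:]

print(length)  # 3
print(first)   # a#b
print(rest)    # 5#hello
```

The decoder never searches the payload for a delimiter, which is what makes the format unambiguous.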
Step-by-Step Breakdown
Encode Function
For each string:
Get its length.
Add length + # + string.
Combine everything into one long string.
def encode(strs):
    res = ""
    for s in strs:
        res += str(len(s)) + '#' + s
    return res
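If you prefer to avoid repeated string concatenation, one possible variant builds all the pieces first and joins them once. The name encode_joined is hypothetical; the output format is the same length-prefix scheme described above.

```python
def encode_joined(strs):
    # Build "<length>#<string>" for each item, then join everything in one pass.
    return "".join(f"{len(s)}#{s}" for s in strs)

print(encode_joined(["leet", "Code", "is", "cool"]))  # 4#leet4#Code2#is4#cool
print(repr(encode_joined([""])))                      # '0#'
print(repr(encode_joined([])))                        # ''
```

Note that an empty string still produces a record (0#), so a list containing one empty string and an empty list encode differently.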
Decode Function
Loop through the string.
Find the length by scanning until you see #.
Read the next length characters.
Repeat until done.
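def decode(s):
    res = []
    i = 0  # start of the current record
    while i < len(s):
        # Scan forward to the '#' that ends the length prefix.
        j = i
        while s[j] != "#":
            j += 1
        length = int(s[i:j])
        # Read exactly `length` characters after the '#'.
        res.append(s[j+1:j+1+length])
        # Jump to the start of the next record.
        i = j + 1 + length
    return res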
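The decoder above assumes its input was produced by encode. If the string might be truncated or corrupted in transit, a more defensive version could validate each record as it goes. This is only a sketch: the name decode_strict and the choice to raise ValueError are assumptions, not part of the original problem.

```python
def decode_strict(s):
    res = []
    i = 0
    while i < len(s):
        # Locate the '#' that should end the length prefix.
        j = s.find("#", i)
        if j == -1:
            raise ValueError(f"no '#' found after position {i}")
        prefix = s[i:j]
        # The prefix must be a non-empty run of ASCII digits.
        if not (prefix.isascii() and prefix.isdigit()):
            raise ValueError(f"invalid length prefix {prefix!r} at position {i}")
        length = int(prefix)
        end = j + 1 + length
        # Refuse to silently return a shortened string.
        if end > len(s):
            raise ValueError(f"record at position {i} is truncated")
        res.append(s[j + 1:end])
        i = end
    return res

print(decode_strict("4#leet4#Code"))  # ['leet', 'Code']
```

For well-formed input it returns the same result as decode; for input such as "5#abc" it raises an error instead of returning a shortened string.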
Test Example
input_data = ["hello", "world", "python"]
encoded = encode(input_data)
print(encoded)  # Output: 5#hello5#world6#python
decoded = decode(encoded)
print(decoded)  # Output: ['hello', 'world', 'python']
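This works even if the strings contain special characters, punctuation, or # itself, because the decoder reads exactly as many characters as the length says and never searches the data for a delimiter.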
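A quick way to gain confidence in that claim is a round-trip check over awkward inputs. The sketch below restates encode and decode compactly so it can run on its own; the list of cases is just one illustrative selection.

```python
def encode(strs):
    # Same scheme as above: <length>#<string> for every item.
    return "".join(f"{len(s)}#{s}" for s in strs)

def decode(s):
    res, i = [], 0
    while i < len(s):
        j = s.index("#", i)          # end of the length prefix
        length = int(s[i:j])
        res.append(s[j + 1:j + 1 + length])
        i = j + 1 + length           # start of the next record
    return res

cases = [
    [],                    # empty list
    [""],                  # a single empty string
    ["#"],                 # the delimiter itself
    ["12#ab", "3#"],       # data that looks like an encoded record
    ["a,b", "", "x y"],    # punctuation, empty string, spaces
    ["héllo", "日本語"],    # non-ASCII characters
]

for case in cases:
    assert decode(encode(case)) == case

print("all round trips OK")
```

Lengths here are counted in Python characters, so the check holds as long as both sides treat the encoded value as a Python str.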
Time and Space Complexity
Operation | Time Complexity |
Encoding | O(N) |
Decoding | O(N) |
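Where N is the total number of characters in all strings combined. Space is also O(N) in both directions, since each function builds an output about as large as its input.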
Why is this important?
→ It’s linear and fast, which matters when you process millions of strings.
→ No extra libraries. No tricky edge cases. Clean and reliable.
Key Takeaways
Avoid naive .join() when the delimiter might appear inside your strings.
Use length-prefix encoding to build reliable encoders/decoders.
It trains you to think like a systems engineer: what can go wrong, and how do you prevent it?
Written by Sam Anirudh Malarvannan
