Moms are Hashmapping Geniuses - Part 1: HashMap, Hashing, Collision and an Organized Kitchen


The Kitchen Where It All Began
I've used HashMap for years. Put a key, get a value. Fast. Clean. Easy.
I never really thought about how it works. It just… worked.
Then one day, I was helping my mom organize our kitchen. We had a lot of drawers - each loosely assigned to different things.
"Tea bags go in drawer 3 as usual," I said, confidently.
But drawer 3 was already full of tea bags.
I turned to my mom and asked, "Where should I put these now?"
She casually replied,
"If 3 is full, just put them in 2 or 4 - I'll still check nearby."
I blinked.
Something about that logic felt... oddly smart.
At the time, I didn't think much of it.
But later that night, I kept coming back to that moment. The way she handled overflow. The fallback plan. The "check nearby" rule.
And out of nowhere, it hit me:
"Wait… did my mom just HashMap in real life?"
That thought sent me spiraling into late-night docs, internal implementation guides and some very nerdy GitHub issues.
And what I found completely changed how I thought about one of Java's most-used data structures.
What is a HashMap, Really?
Let's say you have a cupboard full of drawers, and each drawer is meant to hold a specific item. You want a system where you can store something and instantly find it later - no labels, no search effort.
You tag every item with a key. And based on that key, the cupboard automatically figures out which drawer it should go into.
Later, when you provide the same key again, it instantly retrieves the correct item from the exact drawer - no scan, no guesswork.
That's essentially how a HashMap functions: a data structure that provides constant-time complexity - O(1) on average - for inserting, retrieving, and deleting entries using a computed index.
You give it a key - it stores your value in a calculated slot. Give the same key again - it finds that slot and gives the value back. Instantly!
Behind the scenes, it uses something called hashing to determine where each item should go.
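In code, that contract is just put and get. Here's a quick sketch using Java's java.util.HashMap (the keys and values are only the kitchen analogy, not anything from the real API):

```java
import java.util.HashMap;

public class KitchenMap {
    public static void main(String[] args) {
        HashMap<String, String> drawers = new HashMap<>();

        // put: the map hashes the key and files the value into a bucket
        drawers.put("tea", "drawer 3");
        drawers.put("sugar", "drawer 7");

        // get: the same key hashes to the same bucket,
        // so lookup is O(1) on average
        System.out.println(drawers.get("tea"));   // drawer 3
        System.out.println(drawers.get("sugar")); // drawer 7
    }
}
```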
What is Hashing?
Hashing is the process by which a HashMap calculates the storage index - or bucket - for a key-value pair. It's the secret sauce behind those O(1) lookups.
You provide a key. Internally, the system runs it through a hash function to compute the drawer (bucket) number - a mathematical operation that maps your key to an integer representing the slot.
A simplified version might look like this:
Y = K % N
Where:
- K is the key (typically an integer, or the hash of a string)
- N is the total number of drawers (the hash table size)
- Y is the resulting index (bucket number)
Let's assume the drawer count N is 10. Here's how a few keys get mapped:
| Key | Hash Function | Result |
|-----|---------------|--------|
| 17  | 17 % 10       | 7      |
| 23  | 23 % 10       | 3      |
| 34  | 34 % 10       | 4      |
| 51  | 51 % 10       | 1      |
| 67  | 67 % 10       | 7      |
At first glance, this works perfectly. Each key lands in a precise, predictable drawer. Retrieval becomes instant using the same deterministic formula.
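The mapping above is simple enough to reproduce in a few lines. A toy sketch of the Y = K % N formula (this is not HashMap's real hash function, just the article's simplified version):

```java
public class ModuloHash {
    // Toy hash function from the article: Y = K % N
    static int bucket(int key, int drawers) {
        return key % drawers;
    }

    public static void main(String[] args) {
        int n = 10; // N: total number of drawers
        for (int key : new int[] {17, 23, 34, 51, 67}) {
            System.out.println(key + " % " + n + " = " + bucket(key, n));
        }
    }
}
```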
Hashing enables constant-time access - and that's the whole game. The better the distribution of keys across buckets, the more performant your map remains.
Or so it seems…
But Why Does Something Feel Odd?
Take a closer look.
Both key 17 and key 67 get mapped to drawer 7.
They're completely different values - but they collide at the same location.
This situation is known as a collision:
Two distinct keys are assigned to the same bucket by the hash function.
Collisions are not rare edge cases - they're an expected reality. Anytime you try to map a wide range of inputs (keys) into a limited number of outputs (buckets), overlaps are inevitable.
Which means the system needs to make an internal decision: How do I store both values without overwriting one?
So the real question becomes:
How does a HashMap gracefully handle collisions - and still maintain its performance guarantees?
In other words: how does it stay fast - even when things go wrong?
With the help of collision handling strategies.
Collision Handling Strategies
1. Linear Probing
One of the simplest techniques is Linear Probing. When a collision occurs, the map simply checks the next available bucket.
If that's full too? It checks the next one - wrapping around to the start if needed - until it finds an empty spot.
Kitchen Analogy:
Drawer 7 is full with sugar (key 17). Salt (key 67) also hashes to 7. Mom peeks into drawer 8. If 8 is full, she checks 9, then loops back around to 0, 1...
Linear Probing Table Example:
Let's say the table size is 10. We try to insert the following keys:
| Key | Probe Sequence | Final Position |
|-----|----------------|----------------|
| 17  | 7              | 7              |
| 67  | 7 → 8          | 8              |
| 97  | 7 → 8 → 9      | 9              |
| 27  | 7 → 8 → 9 → 0  | 0              |
Pros:
Simple to implement and understand
Efficient cache performance due to locality of reference
Cons:
Suffers from primary clustering - keys with the same hash crowd together
Insertions can slow down under high load
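The probe table above can be sketched as a tiny open-addressing map. This is illustrative only (class and method names are my own): it omits deletion and resizing, and it would loop forever if the table ever fills up completely.

```java
public class LinearProbingTable {
    private final int[] keys;
    private final String[] values; // null marks an empty drawer
    private final int size;

    public LinearProbingTable(int size) {
        this.size = size;
        this.keys = new int[size];
        this.values = new String[size];
    }

    // On collision, step to the next drawer, wrapping around at the end
    public void put(int key, String value) {
        int index = key % size;
        while (values[index] != null && keys[index] != key) {
            index = (index + 1) % size; // probe the next bucket
        }
        keys[index] = key;
        values[index] = value;
    }

    // Retrieval retraces the same probe sequence
    // until it hits the key or an empty slot
    public String get(int key) {
        int index = key % size;
        while (values[index] != null) {
            if (keys[index] == key) {
                return values[index];
            }
            index = (index + 1) % size;
        }
        return null; // hit an empty drawer: the key was never stored
    }
}
```

Inserting 17, 67, 97, and 27 into a table of size 10 lands them in slots 7, 8, 9, and 0, matching the table above.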
2. Chaining
In Chaining, each bucket contains a list or chain of entries. If multiple keys hash to the same bucket, they're added to the list at that location.
Kitchen Analogy:
Drawer 7 already has sugar. Now salt also belongs in 7. Mom adds a divider and keeps both in the same drawer - labeled neatly.
Bucket Representation:
Here's what drawer 7 would look like if we used chaining:
Bucket 7:
┌───────────────┐
│ 17 | Sugar    │
├───────────────┤
│ 67 | Salt     │
├───────────────┤
│ 97 | Jam      │
└───────────────┘
Under the Hood:
Each entry is stored as a (key, value) pair, often wrapped in a linked list:
7 → [ 17 | Sugar ] → [ 67 | Salt ] → [ 97 | Jam ]
During retrieval, the map walks through the list, comparing keys using .equals() until it finds the match.
Pros:
Easy to implement and visualize
Load factor can exceed 1
Cons:
Worst-case lookup degrades to O(n)
Extra memory for linked lists
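The drawer-with-dividers picture translates almost directly into code. A minimal chaining sketch (again, my own toy classes, not java.util.HashMap - real implementations also resize, and modern Java converts long chains into trees):

```java
import java.util.LinkedList;

public class ChainingTable {
    // One (key, value) entry in a bucket's chain
    private static final class Entry {
        final int key;
        String value;
        Entry(int key, String value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    public ChainingTable(int size) {
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) {
            buckets[i] = new LinkedList<>(); // every drawer starts empty
        }
    }

    public void put(int key, String value) {
        LinkedList<Entry> chain = buckets[key % buckets.length];
        for (Entry e : chain) {
            if (e.key == key) { e.value = value; return; } // same key: overwrite
        }
        chain.add(new Entry(key, value)); // collision or new key: append
    }

    public String get(int key) {
        // Walk the chain, comparing keys until a match is found
        for (Entry e : buckets[key % buckets.length]) {
            if (e.key == key) return e.value;
        }
        return null;
    }
}
```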
3. Double Hashing
Double Hashing applies a second hash function to compute a jump interval, reducing clustering and giving each key a more unique probe sequence.
Kitchen Analogy:
Drawer 7 is full. Instead of checking 8, 9... Mom says, "Let's jump by 3 drawers." If that's full, she jumps another 3. The jump size depends on the item.
Calculation:
hash1 = key % size;
hash2 = prime - (key % prime);
index = (hash1 + i * hash2) % size;
Example:
Assume: size = 10, prime = 7
| Key | hash1 | hash2 | Probes |
|-----|-------|-------|--------|
| 17  | 7     | 4     | 7      |
| 67  | 7     | 3     | 7 → 0  |
| 97  | 7     | 1     | 7 → 8  |
| 107 | 7     | 5     | 7 → 2  |
Final Table:
0 → 67 → Salt
2 → 107 → Jam
7 → 17 → Sugar
8 → 97 → Coffee Beans
Pros:
Reduces clustering significantly
No linked list or extra memory
Cons:
More complex logic
Needs a good secondary hash to avoid cycles
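The calculation above can be checked directly. A small sketch that reproduces the probe table (note that hash2 is always between 1 and prime, so the jump is never zero; keeping size and the jump coprime - e.g. by making size itself prime - is what avoids short cycles):

```java
public class DoubleHashingDemo {
    static final int SIZE = 10;  // number of drawers
    static final int PRIME = 7;  // a prime smaller than SIZE

    static int hash1(int key) { return key % SIZE; }

    // Always in 1..PRIME, so we always move on a collision
    static int hash2(int key) { return PRIME - (key % PRIME); }

    // The i-th drawer in this key's probe sequence
    static int probe(int key, int i) {
        return (hash1(key) + i * hash2(key)) % SIZE;
    }

    public static void main(String[] args) {
        for (int key : new int[] {17, 67, 97, 107}) {
            System.out.println(key + ": " + probe(key, 0) + " -> " + probe(key, 1));
        }
    }
}
```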
Real Talk: Moms don't do double hashing.
They don't need to. Their drawers are organized from day one.
We engineers? We create the chaos and then invent skip logic to fix it.
Cleaning Up - But Not Quite Done Yet
So far, we've built a pretty solid kitchen system:
- A way to calculate where to store things (hashing)
- Strategies to handle drawer collisions (linear probing, chaining, double hashing)
Honestly, for many use cases - that's enough.
Put an item, get it back later. Boom. HashMap magic.
But then…
One day you open your kitchen drawer… and it's full.
Not just one drawer - everything is full.
You try putting one more item in… and suddenly, the system collapses.
Lookups slow down. Everything feels stuck. The magic is gone.
That's when you realize: It's not just about putting things in the right place. It's also about knowing when to grow.
Ever notice how your mom's kitchen never seems to run out of space?
No matter how many new packets arrive - there's always room. Always structure.
First we saw how moms are hashing geniuses...
Next, we'll see why they're rehashing geniuses too.
Up Next:
Moms are Hashmapping Geniuses - Part 2
Rehashing, Load Factor and an Ever-Expanding Kitchen
Because when the kitchen gets full, a good system doesn't panic.
It reshuffles and comes back stronger.