Advanced HashMap Series: Thread Safety in HashMaps

Harshavardhanan

Intro — The Unseen Collision Between Threads and HashMaps

We spend so much time picking the right data structures for performance,
but not enough time thinking about what threads will do to them.

HashMap is a tool we all reach for.
And threads — whether it’s background jobs, async tasks, schedulers, or just good old ThreadPoolExecutor — are everywhere in modern systems.

So it’s easy to overlook the fact that these two don’t mix.
Or rather, they do — but badly.

This post is not just about thread-safe alternatives.
It’s about what really happens when you use HashMap without protection,
and why something as simple as map.put() in the wrong context can quietly sabotage your system.

This is the final post in the HashMap series. Let’s go all the way in.


The Problem: HashMap Under Concurrency

Let’s say you run this:

Map<String, String> map = new HashMap<>();

Runnable writer = () -> {
    for (int i = 0; i < 10_000; i++) {
        map.put("key" + i, "value" + i);
    }
};

Thread t1 = new Thread(writer);
Thread t2 = new Thread(writer);

t1.start();
t2.start();
t1.join();
t2.join();

What could go wrong?

  • Sometimes nothing.

  • Sometimes missing keys.

  • Sometimes a wrong size(), or a ConcurrentModificationException if any thread is iterating the map at the same time.

  • In older Java versions: infinite loops.
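One way to make these outcomes observable is to give every thread its own distinct keys, so the expected final size is known exactly. The harness below is a minimal sketch under that assumption; the class and method names are made up for illustration, and it is only run here against two thread-safe maps covered later in this post.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical harness (not from the original post): each thread inserts its
// own distinct keys, so the expected size is threads * perThread.
final class ConcurrentWriteHarness {

    static int sizeAfterConcurrentPuts(Map<String, String> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put("t" + id + "-key" + i, "value" + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("expected size:     " + (2 * 10_000));
        System.out.println("synchronizedMap:   " + sizeAfterConcurrentPuts(
                Collections.synchronizedMap(new HashMap<String, String>()), 2, 10_000));
        System.out.println("ConcurrentHashMap: " + sizeAfterConcurrentPuts(
                new ConcurrentHashMap<String, String>(), 2, 10_000));
        // Passing a plain new HashMap<>() is the unsafe case described above:
        // the result is unpredictable and may come up short.
    }
}
```

Both thread-safe maps should report the expected size on every run. A plain HashMap gives no such guarantee, and a run that happens to look correct proves nothing.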


Why HashMap Breaks in Multi-threaded Code

HashMap Resizing Is Not Atomic

When the number of entries exceeds capacity * loadFactor (12 entries with the default capacity of 16 and load factor of 0.75), HashMap resizes.
It:

  1. Allocates a new bucket array

  2. Rehashes existing keys into the new array

  3. Relinks the buckets

Now imagine two threads triggering resize at the same time —
and both modifying the bucket linked list without coordination.

Result?

  • Data loss

  • Loops in the linked list (next → next → ... → same node again)

  • Infinite while (e != null) e = e.next

  • And no easy stacktrace to blame

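This happened in production apps before Java 8. That’s not theoretical; that’s legacy pain. Java 8 changed the resize to preserve node order, which removed this particular loop, but lost updates and other corruption are still possible.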
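The data-loss case is easier to see when the race is replayed step by step. The sketch below is a toy model, not HashMap's source: a single bucket whose head is updated with an unsynchronized read-then-write, with the two "threads" interleaved by hand in one possible order. All names are hypothetical.

```java
// Toy model (hypothetical, not HashMap's source): one bucket whose head is
// updated by a read followed by a write, replayed in an order that two
// racing threads could produce.
final class LostInsertDemo {

    static final class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    static Node head; // the single bucket

    static int bucketLength() {
        int n = 0;
        for (Node e = head; e != null; e = e.next) {
            n++;
        }
        return n;
    }

    static int replayRacyInterleaving() {
        head = null;
        Node seenByA = head;            // thread A reads the bucket head
        Node seenByB = head;            // thread B reads it before A has written
        head = new Node("a", seenByA);  // thread A publishes its node
        head = new Node("b", seenByB);  // thread B overwrites the head; "a" is unreachable
        return bucketLength();
    }

    public static void main(String[] args) {
        // prints "nodes after two inserts: 1"
        System.out.println("nodes after two inserts: " + replayRacyInterleaving());
    }
}
```

Two inserts went in and one node came out, with no exception raised. A real resize has many more steps than this, which only widens the window for the same kind of overwrite.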


What is and isn’t thread-safe in a plain HashMap?

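| Operation | Thread-safe? | Why |
| --- | --- | --- |
| Multiple reads, no writers | ✅ Technically safe | Nothing changes structurally, but the map must be safely published first or readers may see stale data |
| Read + write | ❌ Unsafe | Readers may miss updates or see partial states |
| Multiple writes | ❌ Unsafe | Can corrupt internal structure, especially during resize |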

The bottom line: HashMap is not designed for concurrent access.
And neither are LinkedHashMap, TreeMap, HashSet, or LinkedHashSet.


The Three Ways to Make a Map Thread-Safe

1. Collections.synchronizedMap()

Map<String, String> map = Collections.synchronizedMap(new HashMap<>());

✅ Simple
❌ Coarse-grained — locks the entire map on every read/write
❌ You must manually lock the map during iteration:

synchronized(map) {
    for (Map.Entry<String, String> e : map.entrySet()) {
        // safe iteration
    }
}

Good for retrofitting, not for high-performance systems.
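Iteration is not the only case that needs the explicit lock. Any check-then-act sequence on a synchronized map is two separate locked calls, so it needs the same treatment. A minimal sketch, with hypothetical class and method names:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: a compound operation on a synchronizedMap.
final class SynchronizedMapCompound {

    private static final Map<String, String> MAP =
            Collections.synchronizedMap(new HashMap<String, String>());

    // containsKey and put are each synchronized on their own, but another
    // thread could slip in between them. Holding the wrapper's lock across
    // both calls closes that gap.
    static boolean addIfMissing(String key, String value) {
        synchronized (MAP) {
            if (MAP.containsKey(key)) {
                return false;
            }
            MAP.put(key, value);
            return true;
        }
    }

    static String get(String key) {
        return MAP.get(key); // single call, already synchronized by the wrapper
    }

    public static void main(String[] args) {
        System.out.println(addIfMissing("config", "v1")); // true
        System.out.println(addIfMissing("config", "v2")); // false
        System.out.println(get("config"));                // v1
    }
}
```

This works because the wrapper returned by Collections.synchronizedMap() synchronizes on itself, which is the same reason the iteration example locks on map.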


2. Manual Synchronization

Map<String, String> map = new HashMap<>();

// Every read and write, in every thread, must hold this same lock
synchronized (map) {
    map.put("key", "value");
}

✅ Fine-grained control
❌ Easy to misuse
❌ Doesn’t scale across large codebases or teams
❌ Becomes brittle with compound logic

You can make it safe this way, but the cost is high and the risk of subtle bugs is higher.
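If you do go this route, one way to limit the risk is to keep the map and its lock private to a single class, so no caller can touch the map without the lock. The wrapper below is a sketch of that idea; GuardedRegistry and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: the HashMap and its lock never leave this class,
// so every access path is forced through a synchronized block.
final class GuardedRegistry {

    private final Object lock = new Object();
    private final Map<String, String> entries = new HashMap<>();

    void register(String key, String value) {
        synchronized (lock) {
            entries.put(key, value);
        }
    }

    String lookup(String key) {
        synchronized (lock) {
            return entries.get(key);
        }
    }

    // Compound logic stays correct because the check and both mutations
    // happen under one lock acquisition.
    boolean rename(String oldKey, String newKey) {
        synchronized (lock) {
            if (!entries.containsKey(oldKey) || entries.containsKey(newKey)) {
                return false;
            }
            entries.put(newKey, entries.remove(oldKey));
            return true;
        }
    }

    public static void main(String[] args) {
        GuardedRegistry registry = new GuardedRegistry();
        registry.register("svc-a", "10.0.0.1");
        System.out.println(registry.rename("svc-a", "svc-b")); // true
        System.out.println(registry.lookup("svc-b"));          // 10.0.0.1
    }
}
```

The cost is still one global lock, so this buys correctness and encapsulation, not throughput.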


3. ConcurrentHashMap — Purpose-Built for Concurrency

Java’s ConcurrentHashMap isn’t a patched-up HashMap. It’s a fundamentally different data structure built for concurrent access and high throughput with minimal locking overhead.

ConcurrentHashMap<String, Integer> inventory = new ConcurrentHashMap<>();

inventory.put("Apples", 100);
inventory.computeIfPresent("Apples", (k, v) -> v - 10);

✅ Key Features:

  • Lock-free reads via volatile memory access

  • Fine-grained locking only at the bucket level (not global)

  • Built-in atomic operations like putIfAbsent, computeIfAbsent, merge, replace

  • Safe during concurrent writes, even during resizes

  • Does not allow null keys or values to avoid ambiguity in concurrent state

This is the default map you reach for when:

  • Multiple threads read/write the same map

  • You want high throughput without managing your own locks

  • You care about correctness under load
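To see the inventory example hold up under load, the sketch below runs several seller threads against one ConcurrentHashMap and checks the final stock. The class name, the method name, and the "never sell below zero" rule are assumptions added for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo: several threads decrement the same entry.
final class InventoryDemo {

    static int remainingAfterConcurrentSales(int initialStock, int threads, int salesPerThread) {
        ConcurrentHashMap<String, Integer> inventory = new ConcurrentHashMap<>();
        inventory.put("Apples", initialStock);

        Thread[] sellers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            sellers[t] = new Thread(() -> {
                for (int i = 0; i < salesPerThread; i++) {
                    // Atomic read-modify-write on one key; stock never drops below zero.
                    inventory.computeIfPresent("Apples", (k, v) -> v > 0 ? v - 1 : v);
                }
            });
            sellers[t].start();
        }
        for (Thread seller : sellers) {
            try {
                seller.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for sellers", e);
            }
        }
        return inventory.get("Apples");
    }

    public static void main(String[] args) {
        // 4 threads * 1,000 sales from a stock of 10,000 leaves 6,000.
        System.out.println(remainingAfterConcurrentSales(10_000, 4, 1_000));
    }
}
```

Doing the same with get() followed by put() would lose decrements, because two threads can read the same value before either writes.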


HashMap vs ConcurrentHashMap — What’s Different?

| Feature | HashMap | ConcurrentHashMap |
| --- | --- | --- |
| Thread-safe | ❌ No | ✅ Yes |
| Locking granularity | None | Per-bucket for writes, lock-free for reads |
| Resize behavior | Can cause corruption | Coordinated safely with minimal blocking |
| Allows null keys/values | ✅ Yes (one null key) | ❌ No |
| Fail-fast iteration | ✅ Yes (best effort) | ❌ No (weakly consistent) |
| Atomic operations | ❌ Needs manual sync | ✅ Built-in: compute, merge, etc. |
| Performance under contention | ❌ Poor (every access needs one external lock) | ✅ High |

Internals: Java 7 vs Java 8 ConcurrentHashMap

Java 7 — Segment-Based Design

In Java 7, ConcurrentHashMap was implemented using a number of segments that was fixed when the map was constructed (set by concurrencyLevel, default: 16).
Each segment was a mini-HashMap with its own lock.

Pros:

  • Allowed up to 16 concurrent write operations with the default settings (if keys were distributed well across segments)

  • Fine for moderate concurrency loads

Cons:

  • Segment count was fixed at construction, so it couldn’t scale with workload

  • Collisions within the same segment still caused lock contention

  • Additional memory overhead per segment
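The segment idea is a form of lock striping, and a toy version fits in a few lines. The class below is a sketch of the concept only, not the JDK 7 source: StripedMap is a hypothetical name, each "segment" is a plain HashMap guarded by its own monitor, and for simplicity it also takes the lock on reads.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy lock-striping map (hypothetical, not the JDK 7 implementation):
// a fixed list of small HashMaps, each guarded by its own lock.
final class StripedMap<K, V> {

    private final List<Map<K, V>> segments = new ArrayList<>();

    StripedMap(int segmentCount) {
        for (int i = 0; i < segmentCount; i++) {
            segments.add(new HashMap<K, V>());
        }
    }

    // The key's hash picks the segment; the count never changes after construction.
    private Map<K, V> segmentFor(Object key) {
        int h = key.hashCode();
        h ^= (h >>> 16);
        return segments.get((h & 0x7fffffff) % segments.size());
    }

    V put(K key, V value) {
        Map<K, V> segment = segmentFor(key);
        synchronized (segment) { // writers to other segments are not blocked
            return segment.put(key, value);
        }
    }

    V get(Object key) {
        Map<K, V> segment = segmentFor(key);
        synchronized (segment) {
            return segment.get(key);
        }
    }

    // Sums the segments one at a time, so this is not an atomic snapshot.
    int size() {
        int total = 0;
        for (Map<K, V> segment : segments) {
            synchronized (segment) {
                total += segment.size();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        StripedMap<String, Integer> map = new StripedMap<>(16);
        map.put("a", 1);
        map.put("b", 2);
        System.out.println(map.get("a") + " " + map.size());
    }
}
```

The cons listed above show up directly: two keys that hash to the same segment contend for one lock, and every segment carries its own table whether it is busy or not.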


Java 8+ — Lock-Free Reads, CAS Writes, and TreeBins

Java 8 rewrote ConcurrentHashMap from scratch:

  • Removed segments

  • Introduced a flat table (like HashMap)

  • Introduced lock-free reads using volatile + memory visibility guarantees

  • Used CAS (Compare-And-Swap) to avoid locks wherever possible

  • Fallback to fine-grained synchronized locking per bucket (bin) only when needed

transient volatile Node<K,V>[] table; // shared bucket array, read with volatile semantics

// Lock-free get: walks the bin without acquiring any lock
public V get(Object key) {
    ...
}

Bonus: TreeBins (Red-Black Trees)
When a single bucket gets too many keys (high collision), it's treeified to improve worst-case performance — similar to HashMap.
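The "CAS first, lock the bin only when needed" strategy can be sketched with a fixed-size table. This is a heavily simplified illustration, not the JDK source: TinyCasTable is a hypothetical class with no resizing, no removal, and no tree bins, and it assumes non-null keys.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Simplified sketch (hypothetical, not the JDK implementation).
final class TinyCasTable<K, V> {

    private static final class Node<K, V> {
        final K key;
        volatile V value;
        volatile Node<K, V> next;

        Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final AtomicReferenceArray<Node<K, V>> bins;

    TinyCasTable(int capacity) {
        bins = new AtomicReferenceArray<>(capacity);
    }

    private int indexFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % bins.length();
    }

    // Lock-free read: only volatile reads of the bin head and node fields.
    V get(Object key) {
        for (Node<K, V> n = bins.get(indexFor(key)); n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    V put(K key, V value) {
        int i = indexFor(key);
        for (;;) {
            Node<K, V> head = bins.get(i);
            if (head == null) {
                // Empty bin: try to install the first node with a single CAS.
                if (bins.compareAndSet(i, null, new Node<>(key, value))) {
                    return null;
                }
                continue; // another thread won the race; retry against its node
            }
            // Non-empty bin: lock only this bin's head node. With no removal
            // or resize in this sketch, the head never changes once set.
            synchronized (head) {
                for (Node<K, V> n = head; ; n = n.next) {
                    if (n.key.equals(key)) {
                        V old = n.value;
                        n.value = value;
                        return old;
                    }
                    if (n.next == null) {
                        n.next = new Node<>(key, value);
                        return null;
                    }
                }
            }
        }
    }

    // Fills a small table from several threads and counts the keys found afterwards.
    static int fillConcurrently(int threads, int perThread) {
        TinyCasTable<Integer, Integer> table = new TinyCasTable<>(64);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    table.put(base + i, i);
                }
            });
            workers[t].start();
        }
        int found = 0;
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        for (int k = 0; k < threads * perThread; k++) {
            if (table.get(k) != null) {
                found++;
            }
        }
        return found;
    }
}
```

Calling TinyCasTable.fillConcurrently(4, 5_000) should find all 20,000 keys, even though 64 bins force heavy collisions. The real class layers resizing, removal, and treeification on top of this basic shape.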


Summary of Java 8+ Internals

| Operation | Strategy |
| --- | --- |
| get() | Lock-free (volatile reads of the table and its nodes) |
| put() | CAS into an empty bin; otherwise synchronized on that bin's head node |
| Resizing | Coordinated: multiple threads can help resize |
| High-collision bin | Treeified into a red-black tree |
| Iterators | Weakly consistent: no ConcurrentModificationException |
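The iterator difference can be shown without any threads at all, because a fail-fast iterator also trips when the same thread modifies the map mid-iteration. A small sketch, with a hypothetical class name:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo: insert a new key while iterating and see whether the
// iterator throws.
final class IteratorBehaviorDemo {

    static boolean throwsOnInsertDuringIteration(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.put("added-during-iteration", 0); // structural change mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // prints "HashMap fails fast: true"
        System.out.println("HashMap fails fast: "
                + throwsOnInsertDuringIteration(new HashMap<String, Integer>()));
        // prints "ConcurrentHashMap fails fast: false"
        System.out.println("ConcurrentHashMap fails fast: "
                + throwsOnInsertDuringIteration(new ConcurrentHashMap<String, Integer>()));
    }
}
```

Weakly consistent means the ConcurrentHashMap iterator may or may not see the key added during the loop. It only promises not to throw and not to return an entry twice.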

Why ConcurrentHashMap Disallows Nulls

You can't do this:

map.put(null, "value");     // ❌
map.put("key", null);       // ❌

Here’s why:

if (map.get(key) == null) {
    map.put(key, value);
}

This breaks under concurrency:

  • Did .get() return null because the key doesn’t exist?

  • Or because the value is null?

  • What if another thread is in the middle of a .put()?

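With a plain HashMap you could call containsKey() to tell the first two cases apart. In a concurrent map, another thread can change the map between the two calls, so the answer is unreliable.
Disallowing null removes the ambiguity: a null from get() always means the key is absent. The check-then-put above is still a race on its own, which is why you should use atomic APIs like putIfAbsent() or computeIfAbsent() instead.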
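A short sketch of both points: a null value is rejected with a NullPointerException, and putIfAbsent() replaces the get-then-put pattern with a single atomic call. The class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo of null rejection and the atomic alternative to get-then-put.
final class NullHandlingDemo {

    static boolean rejectsNullValue() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        try {
            map.put("key", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Returns the value that ends up in the map for this key. putIfAbsent
    // returns null only when there was no previous mapping, so the result
    // is unambiguous.
    static String insertOnce(ConcurrentHashMap<String, String> map, String key, String value) {
        String existing = map.putIfAbsent(key, value);
        return existing != null ? existing : value;
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullValue());           // true
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        System.out.println(insertOnce(map, "id", "v1"));  // v1
        System.out.println(insertOnce(map, "id", "v2"));  // v1
    }
}
```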


Atomic Operations You Should Use

// 1. Lazy initialization (safe)
map.computeIfAbsent("user123", k -> loadUser());

// 2. Thread-safe counters
map.merge("api:/v1", 1, Integer::sum);

// 3. Replace conditionally
map.replace("id123", "oldVal", "newVal");

Each of these is atomic. You don’t need to synchronize them.
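To check those claims under contention, the sketch below hammers one counter with merge() from several threads, and has several threads race to computeIfAbsent() the same key while counting how often the loader runs. The class, helper, and key names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of merge() and computeIfAbsent() under contention.
final class AtomicOpsDemo {

    private static void runInParallel(int threads, Runnable task) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(task);
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
    }

    // Every thread increments the same counter; no increment is lost.
    static int countHits(int threads, int hitsPerThread) {
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();
        runInParallel(threads, () -> {
            for (int i = 0; i < hitsPerThread; i++) {
                hits.merge("api:/v1", 1, Integer::sum);
            }
        });
        return hits.getOrDefault("api:/v1", 0);
    }

    // Every thread asks for the same key; the loader runs at most once.
    static int loaderCalls(int threads) {
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        AtomicInteger loads = new AtomicInteger();
        runInParallel(threads, () ->
                cache.computeIfAbsent("user123", k -> {
                    loads.incrementAndGet();
                    return "profile-of-" + k;
                }));
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println(countHits(8, 10_000)); // 80000
        System.out.println(loaderCalls(8));       // 1
    }
}
```

One caveat worth keeping in mind: the function passed to computeIfAbsent() runs while the bin is locked, so it should be short and must not update the same map.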


API Differences at a Glance

| Method | HashMap | ConcurrentHashMap |
| --- | --- | --- |
| put() | ❌ Not thread-safe | ✅ Per-bucket locking |
| get() | ❌ Unsafe under concurrent writes | ✅ Lock-free |
| putIfAbsent() | ❌ Check and insert are not atomic | ✅ Atomic |
| computeIfAbsent() | ❌ Not safe | ✅ Atomic |
| replace() | ❌ Not atomic | ✅ Atomic compare-then-replace |
| merge() | ❌ Not atomic | ✅ Atomic |
| entrySet().iterator() | ❌ Fail-fast | ✅ Weakly consistent |

Real-World Use Cases for ConcurrentHashMap

| Scenario | Why It Fits |
| --- | --- |
| User session store | High read/write volume with minimal blocking |
| Caching | computeIfAbsent() avoids duplicate loads |
| Request counters / rate limiting | merge() or compute() keeps updates atomic |
| Visited URL tracking in crawlers | Fast and safe insert-if-not-exist |
| Shared game state (real-time multiplayer) | Lock-free reads, safe mutations |
| Analytics maps (event → count) | Accumulate values without contention |
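As one concrete instance, the crawler case can be sketched with the concurrent Set view that ConcurrentHashMap provides through newKeySet(). The VisitedUrls class and its method names are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical crawler helper: a thread-safe "seen" set backed by ConcurrentHashMap.
final class VisitedUrls {

    private final Set<String> visited = ConcurrentHashMap.newKeySet();

    // add() is atomic, so exactly one caller gets true for a given URL,
    // even if many crawler threads discover it at the same time.
    boolean markVisited(String url) {
        return visited.add(url);
    }

    int count() {
        return visited.size();
    }

    public static void main(String[] args) {
        VisitedUrls seen = new VisitedUrls();
        System.out.println(seen.markVisited("https://example.com/")); // true
        System.out.println(seen.markVisited("https://example.com/")); // false
        System.out.println(seen.count());                             // 1
    }
}
```

The thread that gets true from markVisited() is the one that should fetch the page; everyone else can skip it.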

Summary — Use the Right Tool, Safely

  • HashMap is great for single-threaded access or when you handle locking yourself.

  • Collections.synchronizedMap() is fine for basic needs but hits performance walls fast.

  • ConcurrentHashMap is the only one on this list actually designed to survive concurrency.

Thread safety isn’t just about not crashing.
It’s about maintaining trust in your data under pressure.


Closing the Series

This wraps up the HashMap Deep Dive Series.
We’ve looked at everything from collisions and rehashing to real-world misuse and now thread safety.

If there’s one takeaway to carry forward, it’s this:

A data structure is not just about how it stores — it’s about how it survives.

Thanks for following along.
