Understanding ConcurrentHashMap in Kotlin/Java
Introduction
In multi-threaded programming, managing data consistency and thread safety is paramount. A common challenge is efficiently managing a shared resource without compromising on performance. In Java and Kotlin, the ConcurrentHashMap
is a key tool for achieving this balance. This article will explore what ConcurrentHashMap
is, its use cases, internal workings, best practices, pros, and cons, all explained in a manner accessible to beginners.
What is ConcurrentHashMap?
ConcurrentHashMap
is a thread-safe variant of the HashMap
designed for concurrent access by multiple threads. Unlike HashMap
, which can be corrupted by concurrent modifications, ConcurrentHashMap
allows safe and efficient read/write operations in a multi-threaded environment without locking the entire map.
Use Case
ConcurrentHashMap
is ideal for scenarios where:
Multiple threads frequently read and write to the map.
High performance is required without sacrificing thread safety.
Lock contention must be minimized.
Common use cases include:
Caching shared data in web applications.
Maintaining session information in multi-threaded servers.
Real-time analytics where data is constantly updated.
How ConcurrentHashMap Works Under the Hood
To understand ConcurrentHashMap
, it's essential to grasp its underlying mechanisms:
Segmentation
Earlier versions of
ConcurrentHashMap
(before Java 8) used segmentation. The map was divided into segments, each acting as an independentHashMap
with its own lock. This reduced contention because only the segment containing the key had to be locked during updates.
Lock Stripping
In Java 8,
ConcurrentHashMap
introduced lock stripping, which uses finer-grained locking. It employs a technique called "lock-free reading," allowing reads without acquiring locks. Writes are managed using a combination of locks and CAS (Compare-And-Swap) operations to ensure thread safety.
Internal Structure
ConcurrentHashMap
uses an array ofNode<K, V>
objects. Each node contains a key, value, hash, and a reference to the next node (for handling collisions). The main operations (get, put, remove) work as follows:
Get Operation:
Calculate the hash of the key.
Locate the appropriate bin (index in the array).
Traverse the linked list at that bin to find the key.
Return the corresponding value without acquiring a lock.
Put Operation:
Calculate the hash of the key.
Locate the appropriate bin.
Acquire a lock on the bin.
Check if the key exists; if so, update the value.
If the key doesn't exist, add a new node.
Release the lock.
Remove Operation:
Calculate the hash of the key.
Locate the appropriate bin.
Acquire a lock on the bin.
Traverse the linked list to find the key.
Remove the node if the key is found.
Release the lock.
CAS (Compare-And-Swap)
CAS is an atomic operation used to update variables without locks. It checks if a variable has a specific value and, if so, updates it to a new value atomically. ConcurrentHashMap
uses CAS to implement non-blocking updates for some operations, reducing the need for locks and improving performance.
Examples of How ConcurrentHashMap Works
To fully grasp how ConcurrentHashMap
operates under the hood, let's look at some detailed examples in Kotlin for the main operations: get
, put
, and remove
.
1. Get Operation
The get
operation retrieves a value associated with a key. This operation is lock-free, allowing concurrent reads without blocking.
Example:
import java.util.concurrent.ConcurrentHashMap
fun main() {
val map = ConcurrentHashMap<String, Int>()
map["apple"] = 1
map["banana"] = 2
val key = "apple"
val hash = key.hashCode() // Calculate the hash of the key
val index = hash and (map.size - 1) // Determine the bin index
// Retrieve the value without acquiring a lock
val value = map[key]
println("Value for key '$key': $value")
}
Explanation:
The hash of the key "apple" is calculated.
The bin index is determined by applying a bitwise AND operation between the hash and the map's size minus one.
The value is retrieved by accessing the appropriate bin without acquiring a lock.
2. Put Operation
The put
operation adds a key-value pair to the map. This operation may involve acquiring a lock on the bin to ensure thread safety during updates.
Example:
import java.util.concurrent.ConcurrentHashMap
fun main() {
val map = ConcurrentHashMap<String, Int>()
val key = "orange"
val value = 3
val hash = key.hashCode() // Calculate the hash of the key
val index = hash and (map.size - 1) // Determine the bin index
// Acquire a lock on the bin
synchronized(map) {
// Check if the key already exists
if (map.containsKey(key)) {
map[key] = value // Update the value
} else {
map[key] = value // Add a new key-value pair
}
}
println("Value for key '$key': ${map[key]}")
}
Explanation:
The hash of the key "orange" is calculated.
The bin index is determined similarly to the get operation.
A lock is acquired on the bin to ensure thread safety.
The key-value pair is added to the map, or the existing value is updated.
3. Remove Operation
The remove
operation deletes a key-value pair from the map. This operation also involves acquiring a lock on the bin.
Example:
import java.util.concurrent.ConcurrentHashMap
fun main() {
val map = ConcurrentHashMap<String, Int>()
map["apple"] = 1
map["banana"] = 2
val key = "banana"
val hash = key.hashCode() // Calculate the hash of the key
val index = hash and (map.size - 1) // Determine the bin index
// Acquire a lock on the bin
synchronized(map) {
// Remove the key-value pair if it exists
if (map.containsKey(key)) {
map.remove(key)
}
}
println("Value for key '$key' after removal: ${map[key]}")
}
Explanation:
The hash of the key "banana" is calculated.
The bin index is determined similarly to the previous operations.
A lock is acquired on the bin to ensure thread safety.
The key-value pair is removed if it exists.
CAS (Compare-And-Swap) Operation
ConcurrentHashMap
uses CAS operations for some updates to avoid locking. This is an atomic operation that helps in maintaining thread safety with better performance.
Example:
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger
fun main() {
val map = ConcurrentHashMap<String, AtomicInteger>()
map["apple"] = AtomicInteger(1)
val key = "apple"
val newValue = 2
// CAS operation to update the value
val currentValue = map[key]
if (currentValue != null) {
currentValue.compareAndSet(currentValue.get(), newValue)
}
println("Value for key '$key': ${map[key]?.get()}")
}
Explanation:
An
AtomicInteger
is used to hold the value associated with the key "apple".The
compareAndSet
method performs the CAS operation, updating the value atomically if it matches the expected current value.
Best Practices
To make the most of ConcurrentHashMap
, follow these best practices:
Use ConcurrentHashMap for High Concurrency:
- Ideal for scenarios with frequent reads and writes by multiple threads.
Avoid Excessive Locking:
- Minimize custom synchronization on the map, as
ConcurrentHashMap
handles it internally.
- Minimize custom synchronization on the map, as
Use Appropriate Methods:
- Use methods designed for concurrency, such as
putIfAbsent
,remove(key, value)
, andcomputeIfAbsent
.
- Use methods designed for concurrency, such as
Limit Size:
- Avoid letting the map grow indefinitely. Consider using an eviction policy for large maps.
Thread-safe Iteration:
- Use
ConcurrentHashMap
's iterators and methods likeforEach
,search
, andreduce
for thread-safe operations.
- Use
Pros and Cons
Pros
Thread Safety:
- Safe for concurrent access without additional synchronization.
Performance:
- High performance due to reduced lock contention and efficient algorithms.
Scalability:
- Scales well with increasing number of threads.
Ease of Use:
- Simple to use with intuitive methods for common operations.
Cons
Memory Overhead:
- Slightly higher memory usage compared to
HashMap
due to additional structures for thread safety.
- Slightly higher memory usage compared to
Complexity:
- More complex internal implementation, which might be overkill for single-threaded scenarios.
Inefficiency for Single-threaded Use:
- Not as efficient as
HashMap
for single-threaded applications due to additional overhead.
- Not as efficient as
Example in Kotlin
Here’s a simple example to illustrate how to use ConcurrentHashMap
in Kotlin:
import java.util.concurrent.ConcurrentHashMap
fun main() {
val map = ConcurrentHashMap<String, Int>()
// Add elements to the map
map["apple"] = 1
map["banana"] = 2
// Retrieve an element
println("Value for key 'apple': ${map["apple"]}")
// Update an element
map["apple"] = 3
println("Updated value for key 'apple': ${map["apple"]}")
// Remove an element
map.remove("banana")
println("Value for key 'banana' after removal: ${map["banana"]}")
// Compute if absent
map.computeIfAbsent("orange") { 4 }
println("Value for key 'orange': ${map["orange"]}")
}
Conclusion
ConcurrentHashMap
is a powerful tool for managing shared data in multi-threaded applications. By understanding its internal workings and following best practices, you can leverage its capabilities to
achieve thread-safe and high-performance data management. While it has some drawbacks, the benefits far outweigh them in scenarios requiring concurrent access. Use ConcurrentHashMap
wisely, and it will serve as a robust solution for your concurrent programming needs.
That's it for today. Happy Coding...
Subscribe to my newsletter
Read articles from Romman Sabbir directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Romman Sabbir
Romman Sabbir
Senior Android Engineer from Bangladesh. Love to contribute in Open-Source. Indie Music Producer.