Internal Working of ConcurrentHashMap in Java: A Practical Guide.
Table of contents
ConcurrentHashMap
is a powerful class in Java's java.util.concurrent
package designed for concurrent programming. It provides a thread-safe implementation of a hash map, allowing multiple threads to read and write to the map concurrently without the need for external synchronization. Understanding how ConcurrentHashMap
works internally can help developers make better use of this class and optimize their multithreaded applications.
In this guide, we will delve into the internal workings of ConcurrentHashMap
, explore its design, and demonstrate how it ensures thread safety while maintaining high performance.
1. Introduction to ConcurrentHashMap
ConcurrentHashMap
was introduced in Java 5 and has become a fundamental part of concurrent programming in Java. Unlike a regular HashMap
, which is not thread-safe, ConcurrentHashMap
allows multiple threads to perform read and write operations without the need for external synchronization, ensuring data consistency and improved performance.
It is designed to scale efficiently in multi-threaded environments, where high concurrency is expected, and provides better throughput by reducing contention (locking conflicts).
2. Key Features of ConcurrentHashMap
Thread Safety:
ConcurrentHashMap
ensures that multiple threads can perform read and write operations concurrently without causing data corruption or inconsistency.Lock Striping: Instead of locking the entire map,
ConcurrentHashMap
locks smaller segments (buckets) of the map, allowing multiple threads to work on different segments simultaneously.Non-blocking Reads: Read operations are non-blocking, allowing multiple threads to read data concurrently without interference from write operations.
Atomic Operations: Supports atomic operations like
putIfAbsent()
,compute()
, andmerge()
, which are useful for safe concurrent updates to the map.
3. Internal Structure of ConcurrentHashMap
Segmentation (Buckets)
Internally, ConcurrentHashMap
is divided into several segments or "buckets," each of which is a smaller HashMap
. The map is partitioned into segments to allow multiple threads to work on different parts of the map concurrently without locking the entire map.
Default Segment Size: Initially,
ConcurrentHashMap
is divided into 16 segments (buckets) by default. This allows up to 16 threads to concurrently access different segments without conflict.Rehashing: If the number of elements exceeds a certain threshold, the map automatically increases the number of segments by a factor of 2 to maintain performance.
Locking Mechanism (Lock Striping)
Instead of locking the entire map for a read or write operation, ConcurrentHashMap
uses lock striping to reduce contention. Each segment of the map has its own lock, so multiple threads can concurrently operate on different segments.
Segment Locks: Each segment is independently locked. For example, if two threads try to update elements in different segments, they will not block each other.
Fine-Grained Locking: The locking is fine-grained, meaning that only the affected segment is locked, which improves performance over traditional global locks.
CAS (Compare-And-Swap) Operation
ConcurrentHashMap
uses the CAS (Compare-And-Swap) operation, a low-level atomic operation that allows updates to a variable without locking. It ensures that a thread only updates a value if it hasn't been changed by another thread in the meantime.
For example, when a thread updates a key-value pair in the map, it checks whether the value has been modified by another thread since the thread started the update. If the value was changed, the update is discarded, and the thread retries the operation.
4. Concurrency Level and Performance
The concurrency level determines how many segments ConcurrentHashMap
will use to partition the map. By default, the concurrency level is set to 16, which means the map will have 16 segments. The higher the concurrency level, the more segments the map will have, and the better it will scale with more threads.
Default Concurrency Level: 16
Maximum Concurrency Level: 2^30 (approximately 1 billion segments)
In high-concurrency environments, increasing the concurrency level can improve performance, but using too many segments may increase memory overhead and reduce efficiency.
5. Internal Operations and Methods
put()
, get()
, remove()
, containsKey()
put(): Adds a key-value pair to the map. If the key already exists, it updates the value. The operation is thread-safe and works with segment locking and CAS operations.
ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>(); map.put("apple", 1);
get(): Retrieves the value associated with a key. Read operations are non-blocking and do not require locking.
Integer value = map.get("apple");
remove(): Removes a key-value pair from the map. The operation is thread-safe and uses CAS to ensure correctness.
map.remove("apple");
containsKey(): Checks if a key exists in the map. The operation is thread-safe and does not require locking.
boolean exists = map.containsKey("apple");
Special Methods
putIfAbsent()
: Adds a key-value pair if the key is not already present in the map. This is an atomic operation that ensures that the value is only inserted if it is absent, and it is thread-safe.map.putIfAbsent("banana", 2);
compute()
: Atomically computes a new value for a given key using the provided function, which is useful for updating values based on their current state.map.compute("apple", (key, value) -> value == null ? 1 : value + 1);
merge()
: Merges a key-value pair into the map, applying a remapping function if the key already exists.map.merge("apple", 2, (oldVal, newVal) -> oldVal + newVal);
6. Practical Examples and Usage
Example 1: Basic Operations with ConcurrentHashMap
import java.util.concurrent.ConcurrentHashMap;
public class ConcurrentHashMapExample {
public static void main(String[] args) {
ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
// Adding elements
map.put("apple", 1);
map.put("banana", 2);
// Accessing values
System.out.println(map.get("apple"));
// Removing an element
map.remove("banana");
// Updating elements
map.putIfAbsent("orange", 3);
// Using compute() to update the value
map.compute("apple", (key, value) -> value + 1);
System.out.println(map);
}
}
Example 2: Using putIfAbsent()
for Thread-Safe Insertion
import java.util.concurrent.ConcurrentHashMap;
public class PutIfAbsentExample {
public static void main(String[] args) {
ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
// Thread 1
new Thread(() -> map.putIfAbsent("apple", 10)).start();
// Thread 2
new Thread(() -> map.putIfAbsent("apple", 20)).start();
// Only the first put will succeed, as the second will be ignored
System.out.println(map.get("apple")); // Should print 10
}
}
7. Advantages and Use Cases
Advantages:
Concurrency: Multiple threads can perform operations on different segments concurrently, making it ideal for high-concurrency scenarios.
No Global Locking: Unlike synchronized collections,
ConcurrentHashMap
does not require a global lock, which improves performance in concurrent access scenarios.Atomic Operations: It supports atomic methods for safe and efficient updates.
Use Cases:
Caching: Often used in caching scenarios where threads need to read and write cache data concurrently.
Distributed Systems: Ideal for managing shared resources across multiple threads in distributed systems.
Real-time Applications: Used in systems requiring high performance and low latency, such as real-time analytics.
8. Limitations and Considerations
Memory Usage: Although
ConcurrentHashMap
performs well in highly concurrent environments, it may consume more memory due to internal partitioning.Not Suitable for All Use Cases: For simple use cases where synchronization is not a concern, regular
HashMap
with manual synchronization might be simpler and faster.Atomicity of Non-Atomic Operations: While operations like
putIfAbsent()
are atomic, complex operations requiring multiple actions across multiple keys should still be handled carefully.
9. Conclusion
ConcurrentHashMap
is a highly efficient and scalable data structure for concurrent programming in Java. By leveraging segmentation, fine-grained locking, and atomic operations, it allows safe concurrent read and write operations across multiple threads. Understanding its internal structure and operations can help developers harness its full potential in building high-performance, thread-safe applications.
By carefully considering the use cases and limitations, developers can optimize their applications for concurrency, ensuring both thread safety and performance.
More such articles:
https://www.youtube.com/@maheshwarligade
Subscribe to my newsletter
Read articles from Maheshwar Ligade directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Maheshwar Ligade
Maheshwar Ligade
Learner, Love to make things simple, Full Stack Developer, StackOverflower, Passionate about using machine learning, deep learning and AI