Internal Working of ConcurrentHashMap in Java: A Practical Guide.

ConcurrentHashMap is a powerful class in Java's java.util.concurrent package designed for concurrent programming. It provides a thread-safe implementation of a hash map, allowing multiple threads to read and write to the map concurrently without the need for external synchronization. Understanding how ConcurrentHashMap works internally can help developers make better use of this class and optimize their multithreaded applications.

In this guide, we will delve into the internal workings of ConcurrentHashMap, explore its design, and demonstrate how it ensures thread safety while maintaining high performance.

1. Introduction to ConcurrentHashMap

ConcurrentHashMap was introduced in Java 5 and has become a fundamental part of concurrent programming in Java. Unlike a regular HashMap, which is not thread-safe, ConcurrentHashMap allows multiple threads to perform read and write operations without the need for external synchronization, ensuring data consistency and improved performance.

It is designed to scale efficiently in multi-threaded environments, where high concurrency is expected, and provides better throughput by reducing contention (locking conflicts).


2. Key Features of ConcurrentHashMap

  • Thread Safety: ConcurrentHashMap ensures that multiple threads can perform read and write operations concurrently without causing data corruption or inconsistency.

  • Lock Striping: Instead of locking the entire map, ConcurrentHashMap locks smaller segments (buckets) of the map, allowing multiple threads to work on different segments simultaneously.

  • Non-blocking Reads: Read operations are non-blocking, allowing multiple threads to read data concurrently without interference from write operations.

  • Atomic Operations: Supports atomic operations like putIfAbsent(), compute(), and merge(), which are useful for safe concurrent updates to the map.


3. Internal Structure of ConcurrentHashMap

Segmentation (Buckets)

Internally, ConcurrentHashMap is divided into several segments or "buckets," each of which is a smaller HashMap. The map is partitioned into segments to allow multiple threads to work on different parts of the map concurrently without locking the entire map.

  • Default Segment Size: Initially, ConcurrentHashMap is divided into 16 segments (buckets) by default. This allows up to 16 threads to concurrently access different segments without conflict.

  • Rehashing: If the number of elements exceeds a certain threshold, the map automatically increases the number of segments by a factor of 2 to maintain performance.

Locking Mechanism (Lock Striping)

Instead of locking the entire map for a read or write operation, ConcurrentHashMap uses lock striping to reduce contention. Each segment of the map has its own lock, so multiple threads can concurrently operate on different segments.

  • Segment Locks: Each segment is independently locked. For example, if two threads try to update elements in different segments, they will not block each other.

  • Fine-Grained Locking: The locking is fine-grained, meaning that only the affected segment is locked, which improves performance over traditional global locks.

CAS (Compare-And-Swap) Operation

ConcurrentHashMap uses the CAS (Compare-And-Swap) operation, a low-level atomic operation that allows updates to a variable without locking. It ensures that a thread only updates a value if it hasn't been changed by another thread in the meantime.

For example, when a thread updates a key-value pair in the map, it checks whether the value has been modified by another thread since the thread started the update. If the value was changed, the update is discarded, and the thread retries the operation.


4. Concurrency Level and Performance

The concurrency level determines how many segments ConcurrentHashMap will use to partition the map. By default, the concurrency level is set to 16, which means the map will have 16 segments. The higher the concurrency level, the more segments the map will have, and the better it will scale with more threads.

  • Default Concurrency Level: 16

  • Maximum Concurrency Level: 2^30 (approximately 1 billion segments)

In high-concurrency environments, increasing the concurrency level can improve performance, but using too many segments may increase memory overhead and reduce efficiency.


5. Internal Operations and Methods

put(), get(), remove(), containsKey()

  • put(): Adds a key-value pair to the map. If the key already exists, it updates the value. The operation is thread-safe and works with segment locking and CAS operations.

      ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
      map.put("apple", 1);
    
  • get(): Retrieves the value associated with a key. Read operations are non-blocking and do not require locking.

      Integer value = map.get("apple");
    
  • remove(): Removes a key-value pair from the map. The operation is thread-safe and uses CAS to ensure correctness.

      map.remove("apple");
    
  • containsKey(): Checks if a key exists in the map. The operation is thread-safe and does not require locking.

      boolean exists = map.containsKey("apple");
    

Special Methods

  • putIfAbsent(): Adds a key-value pair if the key is not already present in the map. This is an atomic operation that ensures that the value is only inserted if it is absent, and it is thread-safe.

      map.putIfAbsent("banana", 2);
    
  • compute(): Atomically computes a new value for a given key using the provided function, which is useful for updating values based on their current state.

      map.compute("apple", (key, value) -> value == null ? 1 : value + 1);
    
  • merge(): Merges a key-value pair into the map, applying a remapping function if the key already exists.

      map.merge("apple", 2, (oldVal, newVal) -> oldVal + newVal);
    

6. Practical Examples and Usage

Example 1: Basic Operations with ConcurrentHashMap

import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentHashMapExample {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

        // Adding elements
        map.put("apple", 1);
        map.put("banana", 2);

        // Accessing values
        System.out.println(map.get("apple"));

        // Removing an element
        map.remove("banana");

        // Updating elements
        map.putIfAbsent("orange", 3);

        // Using compute() to update the value
        map.compute("apple", (key, value) -> value + 1);

        System.out.println(map);
    }
}

Example 2: Using putIfAbsent() for Thread-Safe Insertion

import java.util.concurrent.ConcurrentHashMap;

public class PutIfAbsentExample {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

        // Thread 1
        new Thread(() -> map.putIfAbsent("apple", 10)).start();

        // Thread 2
        new Thread(() -> map.putIfAbsent("apple", 20)).start();

        // Only the first put will succeed, as the second will be ignored
        System.out.println(map.get("apple"));  // Should print 10
    }
}

7. Advantages and Use Cases

Advantages:

  • Concurrency: Multiple threads can perform operations on different segments concurrently, making it ideal for high-concurrency scenarios.

  • No Global Locking: Unlike synchronized collections, ConcurrentHashMap does not require a global lock, which improves performance in concurrent access scenarios.

  • Atomic Operations: It supports atomic methods for safe and efficient updates.

Use Cases:

  • Caching: Often used in caching scenarios where threads need to read and write cache data concurrently.

  • Distributed Systems: Ideal for managing shared resources across multiple threads in distributed systems.

  • Real-time Applications: Used in systems requiring high performance and low latency, such as real-time analytics.


8. Limitations and Considerations

  • Memory Usage: Although ConcurrentHashMap performs well in highly concurrent environments, it may consume more memory due to internal partitioning.

  • Not Suitable for All Use Cases: For simple use cases where synchronization is not a concern, regular HashMap with manual synchronization might be simpler and faster.

  • Atomicity of Non-Atomic Operations: While operations like putIfAbsent() are atomic, complex operations requiring multiple actions across multiple keys should still be handled carefully.


9. Conclusion

ConcurrentHashMap is a highly efficient and scalable data structure for concurrent programming in Java. By leveraging segmentation, fine-grained locking, and atomic operations, it allows safe concurrent read and write operations across multiple threads. Understanding its internal structure and operations can help developers harness its full potential in building high-performance, thread-safe applications.

By carefully considering the use cases and limitations, developers can optimize their applications for concurrency, ensuring both thread safety and performance.

More such articles:

https://medium.com/techwasti

https://www.youtube.com/@maheshwarligade

https://techwasti.com/series/spring-boot-tutorials

https://techwasti.com/series/go-language

0
Subscribe to my newsletter

Read articles from Maheshwar Ligade directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Maheshwar Ligade
Maheshwar Ligade

Learner, Love to make things simple, Full Stack Developer, StackOverflower, Passionate about using machine learning, deep learning and AI