Understanding ConcurrentHashMap in Kotlin/Java

Introduction

In multi-threaded programming, managing data consistency and thread safety is paramount. A common challenge is efficiently managing a shared resource without compromising on performance. In Java and Kotlin, the ConcurrentHashMap is a key tool for achieving this balance. This article will explore what ConcurrentHashMap is, its use cases, internal workings, best practices, pros, and cons, all explained in a manner accessible to beginners.

What is ConcurrentHashMap?

ConcurrentHashMap is a thread-safe variant of the HashMap designed for concurrent access by multiple threads. Unlike HashMap, which can be corrupted by concurrent modifications, ConcurrentHashMap allows safe and efficient read/write operations in a multi-threaded environment without locking the entire map.

Use Case

ConcurrentHashMap is ideal for scenarios where:

Multiple threads frequently read and write to the map.
High performance is required without sacrificing thread safety.
Lock contention must be minimized.

Common use cases include:

Caching shared data in web applications.
Maintaining session information in multi-threaded servers.
Real-time analytics where data is constantly updated.

How ConcurrentHashMap Works Under the Hood

To understand ConcurrentHashMap, it's essential to grasp its underlying mechanisms:

Segmentation

Earlier versions of ConcurrentHashMap (before Java 8) used segmentation. The map was divided into segments, each acting as an independent HashMap with its own lock. This reduced contention because only the segment containing the key had to be locked during updates.

Lock Stripping

In Java 8, ConcurrentHashMap introduced lock stripping, which uses finer-grained locking. It employs a technique called "lock-free reading," allowing reads without acquiring locks. Writes are managed using a combination of locks and CAS (Compare-And-Swap) operations to ensure thread safety.

Internal Structure

ConcurrentHashMap uses an array of Node<K, V> objects. Each node contains a key, value, hash, and a reference to the next node (for handling collisions). The main operations (get, put, remove) work as follows:

Get Operation:
- Calculate the hash of the key.
- Locate the appropriate bin (index in the array).
- Traverse the linked list at that bin to find the key.
- Return the corresponding value without acquiring a lock.
Put Operation:
- Calculate the hash of the key.
- Locate the appropriate bin.
- Acquire a lock on the bin.
- Check if the key exists; if so, update the value.
- If the key doesn't exist, add a new node.
- Release the lock.
Remove Operation:
- Calculate the hash of the key.
- Locate the appropriate bin.
- Acquire a lock on the bin.
- Traverse the linked list to find the key.
- Remove the node if the key is found.
- Release the lock.

CAS (Compare-And-Swap)

CAS is an atomic operation used to update variables without locks. It checks if a variable has a specific value and, if so, updates it to a new value atomically. ConcurrentHashMap uses CAS to implement non-blocking updates for some operations, reducing the need for locks and improving performance.

Examples of How ConcurrentHashMap Works

To fully grasp how ConcurrentHashMap operates under the hood, let's look at some detailed examples in Kotlin for the main operations: get, put, and remove.

1. Get Operation

The get operation retrieves a value associated with a key. This operation is lock-free, allowing concurrent reads without blocking.

Example:

import java.util.concurrent.ConcurrentHashMap

fun main() {
    val map = ConcurrentHashMap<String, Int>()
    map["apple"] = 1
    map["banana"] = 2

    val key = "apple"
    val hash = key.hashCode() // Calculate the hash of the key
    val index = hash and (map.size - 1) // Determine the bin index

    // Retrieve the value without acquiring a lock
    val value = map[key]

    println("Value for key '$key': $value")
}

Explanation:

The hash of the key "apple" is calculated.
The bin index is determined by applying a bitwise AND operation between the hash and the map's size minus one.
The value is retrieved by accessing the appropriate bin without acquiring a lock.

2. Put Operation

The put operation adds a key-value pair to the map. This operation may involve acquiring a lock on the bin to ensure thread safety during updates.

Example:

import java.util.concurrent.ConcurrentHashMap

fun main() {
    val map = ConcurrentHashMap<String, Int>()
    val key = "orange"
    val value = 3

    val hash = key.hashCode() // Calculate the hash of the key
    val index = hash and (map.size - 1) // Determine the bin index

    // Acquire a lock on the bin
    synchronized(map) {
        // Check if the key already exists
        if (map.containsKey(key)) {
            map[key] = value // Update the value
        } else {
            map[key] = value // Add a new key-value pair
        }
    }

    println("Value for key '$key': ${map[key]}")
}

Explanation:

The hash of the key "orange" is calculated.
The bin index is determined similarly to the get operation.
A lock is acquired on the bin to ensure thread safety.
The key-value pair is added to the map, or the existing value is updated.

3. Remove Operation

The remove operation deletes a key-value pair from the map. This operation also involves acquiring a lock on the bin.

Example:

import java.util.concurrent.ConcurrentHashMap

fun main() {
    val map = ConcurrentHashMap<String, Int>()
    map["apple"] = 1
    map["banana"] = 2

    val key = "banana"
    val hash = key.hashCode() // Calculate the hash of the key
    val index = hash and (map.size - 1) // Determine the bin index

    // Acquire a lock on the bin
    synchronized(map) {
        // Remove the key-value pair if it exists
        if (map.containsKey(key)) {
            map.remove(key)
        }
    }

    println("Value for key '$key' after removal: ${map[key]}")
}

Explanation:

The hash of the key "banana" is calculated.
The bin index is determined similarly to the previous operations.
A lock is acquired on the bin to ensure thread safety.
The key-value pair is removed if it exists.

CAS (Compare-And-Swap) Operation

ConcurrentHashMap uses CAS operations for some updates to avoid locking. This is an atomic operation that helps in maintaining thread safety with better performance.

Example:

import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

fun main() {
    val map = ConcurrentHashMap<String, AtomicInteger>()
    map["apple"] = AtomicInteger(1)

    val key = "apple"
    val newValue = 2

    // CAS operation to update the value
    val currentValue = map[key]
    if (currentValue != null) {
        currentValue.compareAndSet(currentValue.get(), newValue)
    }

    println("Value for key '$key': ${map[key]?.get()}")
}

Explanation:

An AtomicInteger is used to hold the value associated with the key "apple".
The compareAndSet method performs the CAS operation, updating the value atomically if it matches the expected current value.

Best Practices

To make the most of ConcurrentHashMap, follow these best practices:

Use ConcurrentHashMap for High Concurrency:
- Ideal for scenarios with frequent reads and writes by multiple threads.
Avoid Excessive Locking:
- Minimize custom synchronization on the map, as ConcurrentHashMap handles it internally.
Use Appropriate Methods:
- Use methods designed for concurrency, such as putIfAbsent, remove(key, value), and computeIfAbsent.
Limit Size:
- Avoid letting the map grow indefinitely. Consider using an eviction policy for large maps.
Thread-safe Iteration:
- Use ConcurrentHashMap's iterators and methods like forEach, search, and reduce for thread-safe operations.

Pros and Cons

Pros

Thread Safety:
- Safe for concurrent access without additional synchronization.
Performance:
- High performance due to reduced lock contention and efficient algorithms.
Scalability:
- Scales well with increasing number of threads.
Ease of Use:
- Simple to use with intuitive methods for common operations.

Cons

Memory Overhead:
- Slightly higher memory usage compared to HashMap due to additional structures for thread safety.
Complexity:
- More complex internal implementation, which might be overkill for single-threaded scenarios.
Inefficiency for Single-threaded Use:
- Not as efficient as HashMap for single-threaded applications due to additional overhead.

Example in Kotlin

Here’s a simple example to illustrate how to use ConcurrentHashMap in Kotlin:

import java.util.concurrent.ConcurrentHashMap

fun main() {
    val map = ConcurrentHashMap<String, Int>()

    // Add elements to the map
    map["apple"] = 1
    map["banana"] = 2

    // Retrieve an element
    println("Value for key 'apple': ${map["apple"]}")

    // Update an element
    map["apple"] = 3
    println("Updated value for key 'apple': ${map["apple"]}")

    // Remove an element
    map.remove("banana")
    println("Value for key 'banana' after removal: ${map["banana"]}")

    // Compute if absent
    map.computeIfAbsent("orange") { 4 }
    println("Value for key 'orange': ${map["orange"]}")
}

Conclusion

ConcurrentHashMap is a powerful tool for managing shared data in multi-threaded applications. By understanding its internal workings and following best practices, you can leverage its capabilities to

achieve thread-safe and high-performance data management. While it has some drawbacks, the benefits far outweigh them in scenarios requiring concurrent access. Use ConcurrentHashMap wisely, and it will serve as a robust solution for your concurrent programming needs.

That's it for today. Happy Coding...

Understanding ConcurrentHashMap in Kotlin/Java

Table of contents

Introduction

What is ConcurrentHashMap?

Use Case

How ConcurrentHashMap Works Under the Hood

Segmentation

Lock Stripping

Internal Structure

CAS (Compare-And-Swap)

Examples of How ConcurrentHashMap Works

1. Get Operation

2. Put Operation

3. Remove Operation

CAS (Compare-And-Swap) Operation

Best Practices

Pros and Cons

Pros

Cons

Example in Kotlin

Conclusion

Subscribe to my newsletter

Romman Sabbir

Romman Sabbir