Intro to Multithreading & Concurrency in Java

Ankit Samota
25 min read

Recently, I was working on optimizing thread pools for some critical services. While revisiting the basic concepts of multithreading and concurrency in Java, I noticed a significant gap in the available resources: they were either too simplistic or jumped directly into advanced topics.

This blog aims to fill that gap by offering a guide to help beginners move from basic to advanced concurrency concepts. Let's explore the basics of multithreading and understand why it is an essential tool in modern software systems.

Why Multithreading?

Before we talk about multithreading, let’s first talk about threads — and to do that, we need to start with programs and processes.

A program is an executable file stored on disk. When a program is loaded into memory, it is assigned memory and resources, and becomes active — at that point, it is called a process. A single program can have multiple processes. Inside a process, the smallest unit of execution is called a thread. A single process can have multiple threads sharing memory and resources assigned to the same process. When we use multiple threads inside a single process to improve responsiveness or performance, it is called multithreading.

To improve responsiveness

Imagine a weather app that shows live weather information. When a user opens the app, it fetches the weather data from a third-party service — which might take up to 10 seconds to respond. If we use a single thread to handle everything, the app will freeze while waiting for the response, making the user experience frustrating. But if we use multiple threads — one for the UI and one for the network call — the UI can stay responsive while the network thread fetches the data. This is called concurrency.

Concurrency refers to the ability of a system to execute multiple tasks through simultaneous execution or time-sharing (context switching) on the same CPU core. Concurrent execution of multiple tasks on a single CPU core means the CPU core rapidly switches between tasks, giving the illusion that both tasks are progressing at the same time. Concurrency helps improve responsiveness, especially in I/O-bound applications.

The image illustrates the concept of concurrency on a single CPU core, showing how two tasks (Task 1 in green and Task 2 in pink) take turns executing through time-sharing. The CPU rapidly switches between tasks, creating an illusion of simultaneous execution, though they are actually running in alternating time slots on the same core.

To improve performance/throughput

Let's say we're building an image-processing application for large images. Instead of processing the entire image on a single thread, we can break the image into smaller parts — like cutting it into multiple segments. Then, we can assign each segment to a separate thread running on a different CPU core. This way, multiple segments of the image are processed simultaneously, making the overall processing much faster. This is called parallelism (or parallelization).

Parallelism refers to the ability of a system to execute multiple tasks independently, in parallel, using multiple CPU cores. Unlike concurrency, where tasks take turns executing on a single core, the tasks run side-by-side. Parallelism improves the application's throughput/performance, especially for CPU-bound computations.

This illustration demonstrates parallelism in computing, showing two tasks executing simultaneously on different CPU cores. Task 1 (shown in green) runs on CPU Core 0 while Task 2 (shown in pink) runs on CPU Core 1, with both progressing along the execution time axis. This visual represents true parallel execution where tasks run independently on separate processor cores, improving overall system performance through genuine simultaneous processing.
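As a minimal sketch of the same idea, the example below splits an array into contiguous segments and sums each segment on its own thread. The class name ParallelSegments, the method parallelSum, and the fake "pixel" data are hypothetical stand-ins for real image processing; whether the threads actually run on different cores is up to the OS scheduler.

```java
import java.util.ArrayList;
import java.util.List;

class ParallelSegments {
    // Sums the array by giving each thread its own contiguous segment.
    static long parallelSum(int[] data, int threadCount) {
        long[] partial = new long[threadCount]; // one slot per thread, so no slot is shared
        List<Thread> threads = new ArrayList<>();
        int segment = (data.length + threadCount - 1) / threadCount; // ceiling division

        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            final int from = Math.min(data.length, t * segment);
            final int to = Math.min(data.length, from + segment);
            Thread thread = new Thread(() -> {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                partial[id] = sum;
            });
            threads.add(thread);
            thread.start();
        }

        long total = 0;
        for (int t = 0; t < threadCount; t++) {
            try {
                threads.get(t).join(); // after join(), the worker's write to partial[t] is visible
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
            total += partial[t];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] pixels = new int[1_000];
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] = i % 256; // made-up pixel values
        }
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Sum = " + parallelSum(pixels, cores));
    }
}
```

Each thread writes only to its own slot of the partial array, so the workers never contend with each other and no locking is needed.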

Thread Creation, Lifecycle, and Coordination

Thread Creation

Java offers two main ways to create threads:

  1. Implementing the Runnable Interface

    • Create a class that implements Runnable interface.

    • Override the run() method to define the code that the thread will execute.

    • Pass an instance of this class to a Thread object and call the start() method.

        class Main {
          static class MyRunnable implements Runnable {
            @Override
            public void run() {
              System.out.println("Thread " + Thread.currentThread().getName() + " is running");
            }
          }
      
          public static void main(String[] args) {
            MyRunnable runnable = new MyRunnable();
            Thread thread1 = new Thread(runnable);
            thread1.start();
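            // Since Java 8, we can pass a lambda expression instead of a Runnable class
            Thread thread2 = new Thread(() -> System.out.println("Thread " + Thread.currentThread().getName() + " is running"));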
            thread2.start();
          }
        }
      
  2. Extending the Thread Class

    • Create a class that extends Thread class.

    • Override the run() method.

    • Instantiate and call start() on the class directly.

        class Main {
          static class MyThread extends Thread {
            @Override
            public void run() {
              System.out.println("Thread " + Thread.currentThread().getName() + " is running");
            }
          }
      
          public static void main(String[] args) {
            Thread thread1 = new MyThread();
            thread1.start();
          }
        }
      

Which one should we use?

Both approaches work, but Runnable is generally preferred, especially with thread pools or executor frameworks. It's more flexible and separates the task logic from the thread creation.

Thread lifecycle

A thread in Java can exist in one of the following states during its lifecycle:

  • NEW – Thread has been created but has not yet been started.

  • RUNNABLE – Thread is running or ready to run (i.e., waiting for the OS to schedule it).

  • BLOCKED – Thread is waiting to acquire an intrinsic lock (e.g., synchronized blocks or methods).

  • WAITING – The thread is waiting indefinitely for another thread's action (e.g., wait(), join() without timeout).

  • TIMED_WAITING – Thread is waiting for a fixed time (e.g., sleep(), wait(timeout), join(timeout)).

  • TERMINATED – Thread has finished execution, either by returning from run() normally or because of an uncaught exception.
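The two states that can be observed without any timing assumptions are NEW and TERMINATED. The sketch below reads them with Thread.getState(); the class name ThreadStateDemo is hypothetical. The intermediate states depend on scheduling, so they are left out here.

```java
class ThreadStateDemo {
    // Returns the state seen before start() and the state seen after join().
    static Thread.State[] observeStates() {
        Thread worker = new Thread(() -> { /* no work needed for this demo */ });
        Thread.State before = worker.getState(); // NEW: created but not started
        worker.start();
        try {
            worker.join(); // wait until the worker has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State after = worker.getState(); // TERMINATED: run() has returned
        return new Thread.State[] { before, after };
    }

    public static void main(String[] args) {
        Thread.State[] states = observeStates();
        System.out.println("Before start(): " + states[0]);
        System.out.println("After join():   " + states[1]);
    }
}
```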

Thread Coordination

Thread coordination ensures that multiple threads can cooperate and communicate effectively while accessing shared resources. Java provides several mechanisms for thread coordination:

  • join(): Makes one thread wait for another thread to complete execution.

      thread1.start();
      thread1.join(); // Current thread waits for thread1 to finish
    
  • interrupt(): Used to signal a thread to stop by setting its interrupt flag. It does not guarantee that the thread will be terminated; the thread must check and respond to the interrupt itself.

      Thread thread = new Thread(() -> {
        while (!Thread.currentThread().isInterrupted()) {
          // Work
        }
      });
      thread.start();
      thread.interrupt();
    
  • wait()/notify(): Used for coordination between threads using intrinsic locks. One thread waits (wait()), and another notifies (notify()).

      // Waiting thread
      synchronized (lock) {
        lock.wait();      // Releases lock and waits until notified
      }

      // Notifying thread
      synchronized (lock) {
        lock.notify();    // Wakes up one waiting thread
        lock.notifyAll(); // Or: wakes up all waiting threads
      }
    
  • synchronized: Ensures mutual exclusion; only one thread at a time can access a block or method.

      synchronized (this) {
        // Critical section, only one thread can access at a time
      }
    

    In later sections, we will learn more advanced coordination techniques using Locks and Conditions.
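To see join() and interrupt() working together, here is a small sketch in which a worker sleeps in a loop until it is interrupted. The class name InterruptDemo and the method stopWorker are hypothetical. Note that a thread blocked in sleep() responds to an interrupt by throwing InterruptedException rather than by returning normally.

```java
class InterruptDemo {
    // Starts a worker, interrupts it, and reports whether it stopped because of the interrupt.
    static boolean stopWorker() {
        boolean[] sawInterrupt = new boolean[1];
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(1_000); // simulated work; throws if the thread is interrupted
                }
            } catch (InterruptedException e) {
                sawInterrupt[0] = true;  // the worker decides to stop cleanly
            }
        });
        worker.start();
        worker.interrupt(); // only sets the flag; the worker reacts to it on its own
        try {
            worker.join();  // wait for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sawInterrupt[0] && !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Worker stopped after interrupt: " + stopWorker());
    }
}
```

If the interrupt arrives before the worker reaches sleep(), the flag is already set and sleep() throws immediately, so the outcome is the same either way.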

Daemon Threads

Daemon threads are background service threads that run behind the scenes to support the main application (or user threads). They are mainly used for background tasks like garbage collection, monitoring, etc. Here are a few things to keep in mind -

  • JVM exits when all user threads finish, even if daemon threads are still running.

  • When the JVM shuts down, Daemon threads are terminated, even if they have not finished executing.

  • Threads must be marked as daemon before calling the start() method. Otherwise, it will throw an IllegalThreadStateException.

Thread t = new Thread(() -> {});
t.setDaemon(true); // Must be before start()
t.start();

Understanding the Java Memory Model (JMM)

The Java Memory Model

The Java Memory Model (JMM) defines how threads in Java interact through memory. Let’s break down the core problems it solves.

  1. Visibility

    In multithreaded code, a thread might update a shared variable, but another thread might not immediately see that update. For example - if Thread A (on CPU1) modifies a shared variable in its local cache and doesn’t flush it to the main memory, Thread B (on CPU2) might keep reading an outdated value from the main memory.

    We can solve this problem using the volatile keyword. A useful mental model is that reads and writes of a volatile variable always go to main memory and are never served from a CPU-local cache. So if one thread updates a volatile variable, any other thread that reads it afterwards is guaranteed to see the latest value.

  2. Instruction Reordering

    The JVM and modern CPUs reorder instructions for optimization as long as the outcome is the same for a single-threaded program.

    For example -

     int a = 1;
     int b = 2;
    
     a++;
     b++;
    

    This might be reordered to:

     int a = 1;
     a++;
    
     int b = 2;
     b++;
    

    That’s fine in a single thread; the result is still correct. But across multiple threads, this can break things. For example -

     class Singleton {
         private static Singleton instance;
    
         public static Singleton getInstance() {
             if (instance == null) {
                 instance = new Singleton(); // not atomic!
             }
             return instance;
         }
     }
    

    Here, instance = new Singleton() looks like a single operation, but internally it’s:

     memory = allocate(); // Reserve space in heap memory for the new object
     ctor(memory);        // Run the constructor to initialize the object's fields and state  
     instance = memory;   // Assign the memory address to the instance reference variable
    

    And the compiler may reorder this to:

     memory = allocate();
     instance = memory; // reference assigned before constructor finishes
     ctor(memory);
    

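    Now, another thread might see that instance is non-null and use it before the constructor has completed execution. Declaring the field volatile prevents this reordering, so a thread that sees a non-null instance also sees a fully constructed object. Note that volatile alone does not stop two threads from both passing the null check and creating two instances; guaranteeing a single instance additionally requires synchronization around the check.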

     private static volatile Singleton instance;
    

    Using the volatile keyword here enforces the happens-before relationship. It tells the JVM to provide strict visibility and ordering guarantees between threads: a write to a volatile variable, together with everything the writing thread did before it, is visible to every subsequent read of that variable.

  3. Race Conditions

    A race condition occurs when two or more threads access the same variable at the same time, at least one of them modifies it, and there is no proper synchronization, so the result depends on timing.

    Even a simple line like count++ is not atomic. It breaks down to:

     1. Read count       // CPU loads the current value from memory into a register
     2. Increment it     // CPU adds 1 to the value in the register  
     3. Write it back    // CPU stores the updated value from register back to memory
    

    Now imagine:

    • Thread A and Thread B both read count = 2

    • Both increment to 3

    • Both write 3 back

    • We lost one increment. The count should have been 4.

We can use the synchronized keyword to make the critical section thread-safe.

    synchronized (lock) { 
        count++;
    }

We will talk more about the synchronized keyword in later sections.
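The lost-update problem is easy to reproduce. In the sketch below (the class name CounterRace and its methods are hypothetical), several threads increment two counters: one without any protection and one inside a synchronized block. The protected counter always ends at the expected total, while the unprotected one usually ends lower; the exact shortfall varies from run to run.

```java
class CounterRace {
    private int unsafeCount = 0;
    private int safeCount = 0;
    private final Object lock = new Object();

    void unsafeIncrement() {
        unsafeCount++; // read, increment, write: three steps that can interleave
    }

    void safeIncrement() {
        synchronized (lock) {
            safeCount++; // only one thread at a time runs this
        }
    }

    // Returns { unsafeCount, safeCount } after all threads have finished.
    static int[] run(int threadCount, int incrementsPerThread) {
        CounterRace counter = new CounterRace();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.unsafeIncrement();
                    counter.safeIncrement();
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] { counter.unsafeCount, counter.safeCount };
    }

    public static void main(String[] args) {
        int[] result = run(4, 100_000);
        System.out.println("Expected:       " + 4 * 100_000);
        System.out.println("Unsynchronized: " + result[0]);
        System.out.println("Synchronized:   " + result[1]);
    }
}
```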

The "Happens-Before" Guarantee

The happens-before rule is at the heart of JMM: If A happens before B, then every write done by A is visible to B. Some examples of built-in happens-before rules:

  • A write to a volatile variable happens before every subsequent read of that variable.

  • A call to Thread.start() happens before any actions are taken in that thread.

  • A synchronized block’s unlock happens before the next lock on the same object.
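The volatile rule is what makes the following publishing pattern safe. This is a minimal sketch with hypothetical names (VolatilePublish, payload, ready): the writer sets a plain field and then a volatile flag, and the reader spins on the flag before reading the plain field. Because the volatile write happens before the volatile read that observes it, the reader also sees the earlier plain write.

```java
class VolatilePublish {
    private int payload;            // plain (non-volatile) field
    private volatile boolean ready; // volatile flag that guards the payload

    void publish(int value) {
        payload = value; // 1. ordinary write
        ready = true;    // 2. volatile write: publishes everything written before it
    }

    int awaitPayload() {
        while (!ready) { // volatile read: loops until the write above becomes visible
            Thread.yield();
        }
        return payload;  // guaranteed to see the value written before ready = true
    }

    static int publishAndRead() {
        VolatilePublish shared = new VolatilePublish();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = shared.awaitPayload());
        reader.start();
        shared.publish(42);
        try {
            reader.join(); // join() makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("Reader saw " + publishAndRead());
    }
}
```

Without volatile on the flag, the reader would be allowed to spin forever or to observe a stale payload.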

JVM Memory Structure (Heap vs Stack)

Stack

The stack is a memory region that stores method calls, input parameters, and local variables.

  • Every thread in Java has its own stack, which means that local variables and method calls in one thread are isolated from those in others.

  • The stack keeps track of the sequence of method calls - it is often referred to as the call stack.

  • The combination of the stack and the instruction pointer (which tracks the next line to execute) represents a thread's current state.

  • Since the stack is private to a thread, data stored there is not visible to other threads.

  • The stack is allocated when the thread is created, and its size is fixed (depending on the JVM and OS settings).

  • If a thread calls too many nested methods (deep recursion), it can exceed the stack size limit, resulting in a StackOverflowError.

Heap

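The heap is a shared memory region used to store objects, along with their instance fields, and static fields. On HotSpot JVMs since Java 8, class metadata is kept outside the heap in a native memory area called Metaspace.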

  • All threads share the heap memory.

  • Objects created using the new keyword are stored in the heap memory.

  • The Java Garbage Collector manages heap memory — automatically frees memory used by objects no longer referenced.

  • An object stays in the heap as long as any part of the program references it.

  • Fields of an object and instance members live in the heap alongside the object.

  • Static variables (declared with the static keyword) also reside in the heap and are shared across all instances of a class. These exist for the lifetime of the class in memory.

Synchronization and Locks

Before discussing locks and synchronization, let’s understand why we need them.

As we saw earlier, race conditions, data inconsistency, and weird bugs can occur when multiple threads access a shared mutable state. That’s because operations like count++ seem straightforward but aren’t atomic. Most operations break down into multiple steps (read, modify, write), and if two threads execute them simultaneously, it can lead to incorrect results.

To prevent this, we protect the code that accesses shared resources using locks. That section of code is called a critical section.

A critical section is any block of code that must be executed by only one thread at a time to avoid corrupting shared data.

Now let’s look at the tools Java provides us to protect critical sections:

synchronized keyword

Java provides a built-in keyword called synchronized to ensure mutual exclusion, meaning only one thread can execute a given code block at a time. We can use it in two ways:

Synchronized Methods

public synchronized void increment() {
    a++;
}

Here, the thread acquires a lock on the current object (this) before executing the method. If another thread tries to call this or any other synchronized instance method on the same object, it will block until the lock is released.

It is an example of coarse-grained locking — one lock guards a large section of logic.

Synchronized Blocks

private static final Object lock1 = new Object();
private static final Object lock2 = new Object();

public void update() {
    synchronized (lock1) {
        a++;
    }

    synchronized (lock2) {
        b++;
    }
}

Here, we acquire locks on specific objects, giving us more control. Using different locks, multiple threads can run different synchronized blocks in parallel.

It is an example of fine-grained locking and can help improve performance.

Intrinsic lock

In Java, every object has a built-in lock called an intrinsic lock or monitor. When we use the synchronized keyword, we acquire this monitor lock behind the scenes.

  • How it works:

    • When a thread wants to enter a synchronized block or method, it tries to acquire the lock.

    • If the lock is free, the thread enters and starts executing.

    • If another thread holds the lock, the requesting thread blocks until the lock becomes available.

    • When the thread exits the synchronized block, the lock is automatically released.

  • Limitations:

    • It’s simpler but more rigid — no control over fairness, timeouts, or interruption.

    • A thread will wait indefinitely if it can't acquire the lock.

    • There is no explicit fairness policy, and threads acquire the lock in a non-deterministic order.

    • Threads waiting to acquire a synchronized lock cannot be interrupted, unlike ReentrantLock, which supports lockInterruptibly().

    • It can cause deadlocks if used carelessly.

ReentrantLock

The ReentrantLock class is a more flexible alternative to the synchronized keyword. It provides fine control over locking, such as:

  • Try acquiring the lock without blocking (tryLock())

  • Wait with timeout

  • Respond to interrupts while waiting

  • Fairness in locking

private static final ReentrantLock lock = new ReentrantLock();
private static final ReentrantLock fairLock = new ReentrantLock(true);
private static int a = 0;

public void increment() {
    lock.lock();  // acquire
    try {
        a++;
    } finally {
        lock.unlock(); // always release
    }
}

Always use a try-finally block to ensure the lock is released, even if exceptions are thrown.

Reentrancy

Just like synchronized, ReentrantLock is reentrant. A thread can acquire the lock multiple times (e.g., in recursion). Internally, the lock keeps a hold count, which is incremented on each lock and decremented on each unlock.
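The hold count can be observed directly with getHoldCount(). In this sketch (the class name ReentrancyDemo and the method nestedDepth are hypothetical), a recursive method re-acquires the same lock at every level and reports the deepest count it saw.

```java
import java.util.concurrent.locks.ReentrantLock;

class ReentrancyDemo {
    // Re-acquires the same lock at every recursion level and returns the deepest hold count.
    static int nestedDepth(ReentrantLock lock, int depth) {
        lock.lock(); // hold count goes up by one
        try {
            int holdsHere = lock.getHoldCount();
            if (depth <= 1) {
                return holdsHere;
            }
            return Math.max(holdsHere, nestedDepth(lock, depth - 1));
        } finally {
            lock.unlock(); // hold count goes down by one
        }
    }

    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock();
        System.out.println("Deepest hold count: " + nestedDepth(lock, 3));
        System.out.println("Still locked afterwards? " + lock.isLocked());
    }
}
```

The lock becomes available to other threads only once every lock() has been matched by an unlock(), which is why the try-finally pairing matters at each level.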

Limitations

  • It offers control and flexibility, but requires careful management.

  • The lock must be released manually — easy to forget or mess up.

  • It’s not ideal for read-heavy workloads (use ReadWriteLock in that case).

  • It can cause deadlocks if used carelessly.

Locking Strategies

Coarse-Grained Locking: One lock guards everything. It's simple but hurts concurrency.

Fine-Grained Locking: Separate locks for different resources. It improves performance but is harder to get right.

Deadlocks

Deadlock happens when two or more threads wait on each other forever. Here's a classic deadlock scenario: two threads acquire locks in opposite order.

// Timeline: T1 gets lock1 → T2 gets lock2 → T1 waits for lock2 → T2 waits for lock1
public class DeadlockExample {
    private static final Object lock1 = new Object();
    private static final Object lock2 = new Object();

    public static void main(String[] args) {
        Thread t1 = new Thread(() -> {
            synchronized (lock1) {
                System.out.println("Thread 1: Holding lock1...");
                try { Thread.sleep(100); } catch (InterruptedException e) {}

                synchronized (lock2) {
                    System.out.println("Thread 1: Acquired lock2!");
                }
            }
        });

        Thread t2 = new Thread(() -> {
            synchronized (lock2) {
                System.out.println("Thread 2: Holding lock2...");
                try { Thread.sleep(100); } catch (InterruptedException e) {}

                synchronized (lock1) {
                    System.out.println("Thread 2: Acquired lock1!");
                }
            }
        });

        t1.start();
        t2.start();
    }
}

  • Thread 1 acquires lock1, then tries to acquire lock2.

  • Thread 2 acquires lock2, then tries to acquire lock1.

  • Both threads wait on each other indefinitely; this causes a deadlock.

Conditions necessary for deadlock -

  1. Mutual Exclusion – Only one thread can hold a particular lock at any given time

  2. Hold and Wait – Thread keeps its current locks while waiting to acquire additional locks

  3. No Preemption – The system cannot forcibly remove locks from threads that are holding them

  4. Circular Wait – Threads form a loop where each waits for a lock held by another thread

Ways to Avoid Deadlock -

  • Always acquire locks in a fixed global order

  • Use tryLock() to avoid blocking forever

  • Avoid long-running operations while holding locks

  • Avoid nested locks unless necessary
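One way to apply the tryLock() advice is sketched below, assuming ReentrantLock instead of synchronized. Two threads still take the locks in opposite order, but each one gives up and retries after a short random pause when it cannot get the second lock, which breaks the "hold and wait" condition. The names TryLockDemo, runWithBoth, and retryUntilDone are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class TryLockDemo {
    // Runs the action only if both locks can be taken right now; never blocks.
    static boolean runWithBoth(ReentrantLock first, ReentrantLock second, Runnable action) {
        if (!first.tryLock()) {
            return false;
        }
        try {
            if (!second.tryLock()) {
                return false; // give up instead of waiting forever; finally releases 'first'
            }
            try {
                action.run();
                return true;
            } finally {
                second.unlock();
            }
        } finally {
            first.unlock();
        }
    }

    // Keeps retrying with a short random pause so two threads do not collide repeatedly.
    static void retryUntilDone(ReentrantLock first, ReentrantLock second, Runnable action) {
        while (!runWithBoth(first, second, action)) {
            try {
                Thread.sleep(ThreadLocalRandom.current().nextInt(1, 10));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Two threads acquire the locks in opposite order; returns how many finished.
    static int runOpposingThreads() {
        ReentrantLock lock1 = new ReentrantLock();
        ReentrantLock lock2 = new ReentrantLock();
        AtomicInteger completed = new AtomicInteger();
        Thread t1 = new Thread(() -> retryUntilDone(lock1, lock2, completed::incrementAndGet));
        Thread t2 = new Thread(() -> retryUntilDone(lock2, lock1, completed::incrementAndGet));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Threads completed: " + runOpposingThreads());
    }
}
```

When the lock order can be fixed globally, that is usually the simpler choice; the retry approach is useful when the order is not known in advance.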

Inter-thread Communication

Inter-thread communication allows multiple threads to coordinate their actions and share data safely. While locks help us protect shared resources, sometimes threads must wait for other threads to complete work or notify other threads when they complete work.

Let's take an example of a producer-consumer system in which producer threads put tasks into a queue, and consumer threads pick up those tasks and process them.

Without proper coordination, consumers would constantly check if there are new tasks in the queue, even when producers haven't added anything. It is an example of Busy-Wait.

while (queue.isEmpty()) {
        // Keep checking... wastes CPU cycles!
}
Task task = queue.poll();

A better approach is for producer threads to notify consumer threads whenever new tasks are added to the queue. That way, consumers don’t have to check the queue constantly. It is an example of Blocking-Wait.

synchronized (lock) {
    while (queue.isEmpty()) {
        lock.wait(); // Wait until notified
    }
    Task task = queue.poll();
}

wait(), notify(), notifyAll()

Java provides three key methods for thread coordination:

  • wait() - The current thread releases the lock and waits until notified

  • notify() - Wakes up one waiting thread

  • notifyAll() - Wakes up all waiting threads

Note: These methods must be called inside a synchronized block on the same object.

synchronized (lock) {
    while (!conditionMet) {
        lock.wait();// Releases lock and waits
    }
}

synchronized (lock) {
    conditionMet = true;
    lock.notify(); // Wake up a waiting thread
}

Let’s revisit the producer-consumer example -

Object lock = new Object();
Queue<Task> queue = new LinkedList<>();
int maxSize = 10;

public void put(Task task) throws InterruptedException {
    synchronized(lock) {
        while (queue.size() >= maxSize) {
            lock.wait(); // Queue full, producer waits for tasks to be picked up
        }

        queue.add(task);
        lock.notifyAll(); // Wake up waiting consumers
    }
}

public Task take() throws InterruptedException {
    synchronized (lock) {
        while (queue.isEmpty()) {
            lock.wait(); // Queue is empty, consumer waits for tasks to be added
        }

        Task task = queue.poll();
        lock.notifyAll(); // Wake up waiting producers
        return task;
    }
}

The problem with wait()/notify() is that all threads wait on the same object. When you call notifyAll(), it wakes up all threads, producers and consumers alike, even when only one type should wake up. This approach is better than Busy-Wait, but it is still inefficient.

Condition Variables

Java's Condition interface solves the limitations of wait()/notify() by providing multiple waiting queues per lock.

ReentrantLock lock = new ReentrantLock();
Condition notEmpty = lock.newCondition(); // For consumers
Condition notFull = lock.newCondition(); // For producers
Queue<Task> queue = new LinkedList<>();
int maxSize = 10;

public void put(Task task) throws InterruptedException {
    lock.lock();
    try {
        while (queue.size() >= maxSize) {
            notFull.await(); // Only producers wait here
        }

        queue.add(task);
        notEmpty.signal(); // Wake up one waiting consumer
    } finally {
        lock.unlock();
    }
}

public Task take() throws InterruptedException {
    lock.lock();
    try {
        while (queue.isEmpty()) {
            notEmpty.await(); // Only consumers wait here
        }

        Task task = queue.poll();
        notFull.signal(); // Wake up one waiting producer
        return task;
    } finally {
        lock.unlock();
    }
}

Key Advantages of Condition Variables:

  1. Targeted signalling: notEmpty.signal() only wakes consumers, notFull.signal() only wakes producers

  2. Better performance: No unnecessary thread wake-ups

  3. Cleaner code: The intent of the developer is clear

  4. Additional features: Timeout support with await(timeout, TimeUnit), etc.
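To illustrate the timeout support, here is a minimal sketch of a buffer whose poll method gives up after a deadline. The classes TimedBuffer and TimedBufferDemo are hypothetical. The sketch uses awaitNanos() rather than await(timeout, TimeUnit) because awaitNanos() returns the remaining time, which makes it easy to keep waiting correctly after a spurious or unrelated wake-up.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class TimedBuffer<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Queue<T> queue = new ArrayDeque<>();

    void put(T item) {
        lock.lock();
        try {
            queue.add(item);
            notEmpty.signal(); // wake up one waiting consumer
        } finally {
            lock.unlock();
        }
    }

    // Returns the next item, or null if nothing arrives within the timeout.
    T poll(long timeout, TimeUnit unit) throws InterruptedException {
        long nanosLeft = unit.toNanos(timeout);
        lock.lock();
        try {
            while (queue.isEmpty()) {
                if (nanosLeft <= 0L) {
                    return null; // deadline passed
                }
                nanosLeft = notEmpty.awaitNanos(nanosLeft); // returns the time still remaining
            }
            return queue.poll();
        } finally {
            lock.unlock();
        }
    }
}

class TimedBufferDemo {
    // Polls an empty buffer (times out), then polls again after an item was added.
    static String[] demo() {
        TimedBuffer<String> buffer = new TimedBuffer<>();
        try {
            String first = buffer.poll(50, TimeUnit.MILLISECONDS);
            buffer.put("task-1");
            String second = buffer.poll(50, TimeUnit.MILLISECONDS);
            return new String[] { first, second };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new String[0];
        }
    }

    public static void main(String[] args) {
        String[] results = demo();
        System.out.println("Empty buffer: " + results[0]);
        System.out.println("After put:    " + results[1]);
    }
}
```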

Performance Optimization & Thread Pooling

Now that we understand how threads communicate and coordinate, let's tackle an important question: How many threads do we need?

Creating threads comes with costs. Too many threads can harm application performance due to the overhead of context switching. On the other hand, having too few threads might leave our CPU underutilized.

Cost of Thread Creation

Whenever we create a new thread with new Thread(), several expensive operations happen behind the scenes:

  • Memory allocation: Each thread is assigned memory for its stack

  • OS overhead: The operating system needs to register and manage the thread

  • Initialization: Setting up thread-local storage and other metadata

Now, when we have more threads than CPU cores, the OS must switch between threads to allow each thread to run. This is called context switching.

During a context switch, the CPU must:

  • Save the current thread’s state (registers, program counter, etc.)

  • Load the next thread’s state

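  • Update memory mappings (needed only when the next thread belongs to a different process; threads of the same process share an address space)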

Although each context switch happens within a few microseconds, this can quickly add up if we have many threads, leading to performance degradation.

Thread Pools

Instead of creating new threads for every task, thread pools maintain a collection of pre-created, reusable threads that execute tasks from a shared queue.

// Without thread pool: expensive thread creation
for (int i = 0; i < 1000; i++) {
    new Thread(() -> processTask()).start(); // Creates 1000 threads!
}

// With thread pool: reuse existing threads
ExecutorService pool = Executors.newFixedThreadPool(10); // Only 10 threads
for (int i = 0; i < 1000; i++) {
    pool.submit(() -> processTask()); // Tasks queued, threads reused
}

Key benefits of Thread Pools:

  • Eliminates thread creation overhead as threads are created once and reused

  • Controls resource usage by limiting the number of concurrent threads

  • Provides task queuing to prevent the creation of an unlimited number of threads

  • Improves performance as threads are not created and destroyed repeatedly
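A pool is also the usual way to get results back from tasks. The sketch below submits Callable tasks, collects their Future results, and shuts the pool down when it is done. The class name PoolSumDemo and the pool size of 4 are arbitrary choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSumDemo {
    // Computes 1*1 + 2*2 + ... + n*n, one task per term, on a small fixed pool.
    static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int value = i;
                futures.add(pool.submit(() -> value * value)); // submitted as a Callable<Integer>
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task failed", e.getCause());
        } finally {
            pool.shutdown(); // without this, the pool's threads keep the JVM alive
        }
    }

    public static void main(String[] args) {
        System.out.println("Sum of squares 1..10 = " + sumOfSquares(10));
    }
}
```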

Thread Models

  • Thread-per-Task Model involves creating a new thread for every individual task or request. This model is simple to understand and implement, but thread creation is costly and can potentially crash the system due to the unlimited number of threads.

  • Thread-per-Core Model uses a fixed number of threads equal to (or slightly more than) the number of CPU cores. This model offers predictable resource usage and stable performance, but it may not fully utilize resources for I/O-bound tasks and is more complex to implement.

Blocking vs Non-Blocking I/O

Blocking I/O: When a thread makes a blocking I/O call (database query, HTTP request, file read, etc.), it sits idle, waiting for the response. During this time, the thread consumes memory but does no useful work.

public String fetchData() {
    // Thread sits idle waiting for HTTP response
    String res = httpClient.get("http://www.example.com/users/");
    return res;
}

  • Benefits:

    • It is simpler to implement and understand, as the code executes sequentially, waiting for I/O operations to complete before proceeding.

    • It is easier to debug due to its linear flow.

    • Suitable for applications with low concurrency or where a dedicated thread can handle each request.

  • Limitations:

    • It can lead to performance issues with high concurrency, as each I/O operation blocks the thread, potentially creating many threads and consuming significant resources.

    • Poor scalability due to the thread-per-request model.

    • It can result in thread starvation if many threads are blocked waiting for I/O.

Non-Blocking I/O: With non-blocking I/O, threads don’t sit idle. When an I/O operation starts, the thread can work on other tasks and come back when the I/O is complete.

public CompletableFuture<String> fetchDataAsync() {
    return httpClient.getAsync("http://www.example.com/users/")
                     .thenApply(users -> users.get(0));
}

  • Benefits:

    • Allows a single thread to manage multiple I/O operations concurrently, improving resource utilization.

    • Better performance for applications with high concurrency, as threads are not blocked while waiting for I/O.

    • It avoids thread starvation by allowing threads to switch to other tasks while I/O operations are in progress.

  • Limitations:

    • More complex to implement and manage

    • More difficult to debug due to non-linear flow and potential for race conditions

    • It can result in callback hell if not managed properly, leading to difficult-to-read and maintain code.

A general rule of thumb is to use the Thread-Per-Core model for CPU-intensive tasks and the Thread-Per-Core model with Non-Blocking I/O for I/O-intensive tasks. However, we can't directly apply the Thread-Per-Core model with Non-Blocking I/O in most real-world situations because we need to balance code maintainability with performance. Therefore, we use thread pools with predefined sizes and fine-tune these thread pools to fit our specific needs.
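The asynchronous style can be tried out with nothing but the JDK. In this sketch, slowLookup is a hypothetical stand-in for a remote call and simply sleeps, so it shows the programming model (start the work, attach a continuation, keep the caller free) rather than true non-blocking I/O at the OS level. The class name AsyncChainDemo is also an assumption.

```java
import java.util.concurrent.CompletableFuture;

class AsyncChainDemo {
    // Hypothetical stand-in for a slow remote call.
    static String slowLookup(String user) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "profile:" + user;
    }

    // Starts the lookup on the common pool and attaches a follow-up step to the result.
    static CompletableFuture<Integer> profileLength(String user) {
        return CompletableFuture
                .supplyAsync(() -> slowLookup(user)) // runs on another thread
                .thenApply(String::length);          // runs once the lookup completes
    }

    public static void main(String[] args) {
        CompletableFuture<Integer> length = profileLength("ada");
        System.out.println("Caller is free to do other work while the lookup runs");
        System.out.println("Profile length: " + length.join()); // wait only when the value is needed
    }
}
```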

Thread Pool Size

CPU-Bound Tasks

  • These tasks primarily involve computation and heavily utilize CPU resources.

  • Number of threads = Number of CPU cores

  • More threads than cores can lead to excessive context switching and degrade performance.

I/O-Bound Tasks

  • These tasks spend significant time waiting for I/O operations (e.g., network requests, disk reads, etc.)

  • Number of threads = (Number of CPU cores * (1 + Wait Time/Compute Time))

  • Threads can get blocked while waiting for I/O, so having additional threads allows other tasks to proceed.

That being said, there is no one-size-fits-all formula for thread pool size. We should add proper monitoring for all thread pools and monitor the service performance to fine-tune the thread pool sizes.
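As a starting point before tuning, the two formulas above can be written down directly. The class name PoolSizing and the sample timings are hypothetical; in practice the wait and compute times would come from measurements of the actual workload.

```java
class PoolSizing {
    // CPU-bound: one thread per core.
    static int cpuBound(int cores) {
        return cores;
    }

    // I/O-bound: cores * (1 + waitTime / computeTime), rounded up, at least 1.
    static int ioBound(int cores, double waitTime, double computeTime) {
        double size = cores * (1 + waitTime / computeTime);
        return Math.max(1, (int) Math.ceil(size));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-bound pool size on this machine: " + cpuBound(cores));
        // Example: 8 cores, tasks that wait 90 ms on I/O for every 10 ms of computation
        System.out.println("I/O-bound pool size for 8 cores: " + ioBound(8, 90, 10)); // 80
    }
}
```

The waitTime and computeTime arguments only need to be in the same unit, since the formula uses their ratio.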

ThreadPoolExecutor Deep Dive

ThreadPoolExecutor is the core implementation of thread pools in Java.

ThreadPoolExecutor executor = new ThreadPoolExecutor(
    corePoolSize,        // Always keep these threads alive
    maximumPoolSize,     // Maximum threads allowed
    keepAliveTime,       // How long extra threads stay idle before dying
    TimeUnit.SECONDS,    // Time unit for keepAliveTime
    taskQueue,           // Queue to hold pending tasks
    threadFactory,       // How to create new threads
    rejectionHandler     // What to do when overwhelmed
);

New Task Submitted
        │
        ▼
    Number of threads < corePoolSize?
        │
    ┌───Yes────────────No───┐
    ▼                       ▼
Create new               Queue full?
core thread                 │
                     ┌─────No──────Yes─┐
                     ▼                 ▼
                Add to queue      Total threads < maxPoolSize?
                                      │
                                 ┌───Yes────No───┐
                                 ▼               ▼
                            Create extra      Reject
                             thread           task

Task Queue

The task queue is used to hold tasks waiting to be executed. When a task is submitted to the executor and the number of running threads has already reached corePoolSize, the task is placed in the work queue. A few examples of queue implementations that can be used are -

// Tasks queue up without limit, may cause memory issues if overwhelmed
// Thread Pool never creates threads beyond corePoolSize since queue never fills
new LinkedBlockingQueue<>() 

// Tasks queue up to limit, then new threads created, predictable memory usage
// Thread Pool creates extra threads only when queue is full, balanced resource usage
new ArrayBlockingQueue<>(100) 

// Tasks must be immediately picked up by waiting thread, no queuing allowed
// Thread Pool creates new thread for each task up to maximumPoolSize, highly responsive
new SynchronousQueue<>() 

// Tasks queue without limit but highest priority tasks execute first
// Thread Pool never creates threads beyond corePoolSize, but critical tasks execute first
new PriorityBlockingQueue<>()
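
As a rough sketch of how much the queue choice changes pool behaviour, the helper below submits the same four blocking tasks to a pool with 1 core thread and 4 max threads, swapping only the queue. The class name QueueChoiceDemo, the method name, and the pool sizes are assumptions made for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class QueueChoiceDemo {

    // Submits four blocking tasks to a pool (core=1, max=4) backed by the given
    // queue and reports how many threads and queued tasks the pool ends up with.
    static String poolStateAfterFourTasks(BlockingQueue<Runnable> queue) {
        ThreadPoolExecutor pool =
            new ThreadPoolExecutor(1, 4, 30L, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 4; i++) {
            pool.execute(() -> {
                try {
                    release.await();   // hold the thread so the pool stays busy
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        String state = "threads=" + pool.getPoolSize()
                     + ", queued=" + pool.getQueue().size();
        release.countDown();
        pool.shutdown();
        return state;
    }

    public static void main(String[] args) {
        System.out.println("unbounded:   "
            + poolStateAfterFourTasks(new LinkedBlockingQueue<>()));
        System.out.println("bounded(2):  "
            + poolStateAfterFourTasks(new ArrayBlockingQueue<>(2)));
        System.out.println("synchronous: "
            + poolStateAfterFourTasks(new SynchronousQueue<>()));
    }
}
```

The unbounded queue keeps the pool at a single thread with three tasks waiting, the bounded queue of size 2 grows the pool to two threads once the queue fills, and the SynchronousQueue creates a thread per task because it cannot hold anything.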

Thread Factory

The ThreadFactory is responsible for creating new threads when the pool needs them. We have two options when it comes to thread factories -

  • Use the default Thread Factory: Executors.defaultThreadFactory()

  • Write a custom Thread Factory: Implement ThreadFactory interface
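
A custom factory is mostly useful for giving threads meaningful names, which makes thread dumps and logs much easier to read. The sketch below is one possible implementation; the class name NamedThreadFactory and the naming scheme (prefix plus a counter) are assumptions for illustration, not a standard JDK class.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory that names threads "<prefix>-1", "<prefix>-2", ...
class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task, prefix + "-" + counter.getAndIncrement());
        // New threads inherit the daemon flag of their creator, so set it explicitly
        thread.setDaemon(false);
        // Report exceptions that escape a task instead of losing them silently
        thread.setUncaughtExceptionHandler((t, error) ->
            System.err.println(t.getName() + " failed: " + error));
        return thread;
    }
}
```

We can then pass new NamedThreadFactory("order-worker") as the threadFactory argument of the ThreadPoolExecutor constructor shown earlier, and the pool's threads will show up as order-worker-1, order-worker-2, and so on.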

Rejection Handlers

Rejection handlers handle tasks that cannot be executed when the executor is saturated. A few examples of different rejection handlers are -

// This is the default policy, which throws a RejectedExecutionException 
// when a task cannot be accepted for execution.
new ThreadPoolExecutor.AbortPolicy()

// Instead of discarding the task, this policy runs the task in the thread 
// that calls the execute() method, effectively pushing back the task to the caller.
new ThreadPoolExecutor.CallerRunsPolicy()

// This policy silently discards the rejected task without any notification.
new ThreadPoolExecutor.DiscardPolicy()

// This policy discards the oldest unhandled request and then retries executing the new task.
new ThreadPoolExecutor.DiscardOldestPolicy()

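// A custom policy: implement RejectedExecutionHandler to define our own behavior
// for tasks the executor cannot accept.
new RejectedExecutionHandler() {
    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        // Log, retry, send to another queue, etc.
    }
}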
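
To see a rejection policy in action, here is a small sketch that saturates a pool configured with CallerRunsPolicy and checks which thread ends up running the overflow task. The class name CallerRunsDemo and the pool sizes (1 thread, queue capacity 1) are assumptions picked to force saturation quickly.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the JDK.
class CallerRunsDemo {

    // Returns true if the task submitted to a saturated pool ran on the caller's thread.
    static boolean thirdTaskRanOnCaller() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(1),
            new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        pool.execute(blocker);   // occupies the only worker thread
        pool.execute(blocker);   // fills the single queue slot

        AtomicReference<Thread> ranOn = new AtomicReference<>();
        // Pool is saturated, so CallerRunsPolicy runs this task inside execute()
        pool.execute(() -> ranOn.set(Thread.currentThread()));

        release.countDown();
        pool.shutdown();
        return ranOn.get() == Thread.currentThread();
    }

    public static void main(String[] args) {
        System.out.println("ran on caller thread: " + thirdTaskRanOnCaller());
    }
}
```

Because the submitting thread is busy running the task itself, it cannot submit more work in the meantime. This gives us a simple form of back-pressure, at the cost of slowing down the caller.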

Different Thread Pool Types

Fixed Thread Pool

// Executors.newFixedThreadPool(nThreads)
public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(
        nThreads,                    // corePoolSize = nThreads
        nThreads,                    // maximumPoolSize = nThreads (same as core)
        0L,                          // keepAliveTime = 0 (no extra threads exist, and core threads never time out by default)
        TimeUnit.MILLISECONDS,       // time unit
        new LinkedBlockingQueue<Runnable>(),  // unbounded queue
        Executors.defaultThreadFactory(),     // default thread factory
        new ThreadPoolExecutor.AbortPolicy()  // default rejection policy
    );
}

Single Thread Executor

// Executors.newSingleThreadExecutor()
// Note: the real method also wraps the pool in a delegating ExecutorService
// so that it cannot be reconfigured to use more threads.
public static ExecutorService newSingleThreadExecutor() {
    return new ThreadPoolExecutor(
        1,                           // corePoolSize = 1
        1,                           // maximumPoolSize = 1
        0L,                          // keepAliveTime = 0
        TimeUnit.MILLISECONDS,       // time unit
        new LinkedBlockingQueue<Runnable>(), // unbounded queue, tasks run one at a time in order
        Executors.defaultThreadFactory(),    // default thread factory
        new ThreadPoolExecutor.AbortPolicy() // default rejection policy
    );
}

Cached Thread Pool

// Executors.newCachedThreadPool()
public static ExecutorService newCachedThreadPool() {
    return new ThreadPoolExecutor(
        0,                           // corePoolSize = 0 (no core threads)
        Integer.MAX_VALUE,           // maximumPoolSize = unlimited
        60L,                         // keepAliveTime = 60 seconds
        TimeUnit.SECONDS,            // time unit
        new SynchronousQueue<Runnable>(),    // no storage capacity
        Executors.defaultThreadFactory(),    // default thread factory
        new ThreadPoolExecutor.AbortPolicy() // default rejection policy
    );
}

Custom Thread Pool

// A hand-written factory method (not part of Executors) that combines
// a bounded queue with CallerRunsPolicy for back-pressure
public static ExecutorService newCustomThreadPool(int coreThreads,
                                                  int maxThreads) {
    return new ThreadPoolExecutor(
        coreThreads,                      // corePoolSize
        maxThreads,                       // maximumPoolSize
        60L,                              // keepAliveTime
        TimeUnit.SECONDS,                 // time unit
        new LinkedBlockingQueue<>(100),           // workQueue, bounded to 100 tasks
        Executors.defaultThreadFactory(),         // default thread factory
        new ThreadPoolExecutor.CallerRunsPolicy() // rejectionHandler
    );
}

Performance Monitoring and Tuning

We should constantly monitor our thread pools in production. Key metrics we should look out for are -

  • Pool Size: Current number of threads in the thread pool

  • Queue Size: Tasks waiting to be executed in the queue

  • Active Thread Count: Threads currently executing tasks

Warning signs:

  • Queue constantly growing → Need more threads or faster processing

  • Low CPU utilization with I/O tasks → Pool too small

  • High CPU with context switching → Pool too large
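
The three metrics listed above are all exposed directly by ThreadPoolExecutor. Below is a minimal sketch of how we might read them; the class name PoolMetrics, the output format, and the demo method are assumptions for illustration. In a real service we would publish these values to whatever metrics system is already in use rather than build a string.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, not part of the JDK.
class PoolMetrics {

    // Reads the key metrics from a pool. The values are point-in-time and approximate.
    static String snapshot(ThreadPoolExecutor pool) {
        return String.format("poolSize=%d active=%d queued=%d completed=%d",
            pool.getPoolSize(),             // current number of threads
            pool.getActiveCount(),          // threads currently executing tasks
            pool.getQueue().size(),         // tasks waiting in the queue
            pool.getCompletedTaskCount());  // tasks finished so far
    }

    // Small demo: two workers busy, one task waiting in the queue.
    static String demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 3; i++) {
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            started.await();   // wait until both workers are actually running a task
            return snapshot(pool);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Calling snapshot() periodically (for example from a scheduled task) and plotting the values over time makes the warning signs easy to spot: a queued value that keeps climbing means the pool is not keeping up with the incoming work.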

Conclusion

In conclusion, understanding multithreading and concurrency in Java is essential for creating efficient and responsive software systems. By using threads, we can enhance application responsiveness and performance. The Java Memory Model, synchronization techniques, and thread coordination mechanisms are crucial for managing shared resources and ensuring data consistency. Optimizing thread pools and understanding the differences between blocking and non-blocking I/O can significantly improve application performance.

Here are a few topics you might want to check out next to dive deeper into concurrency:

  • Lock-free programming: Use AtomicInteger and compare-and-swap (CAS) operations for high-performance scenarios.

  • Async patterns: Use CompletableFuture chains for non-blocking operations.

  • Modern Java: Explore virtual threads (Project Loom) for massive concurrency.

  • Reactive programming: Learn about RxJava/Project Reactor for stream processing.

Written by

Ankit Samota

SDE at Amazon with a passion for building scalable systems. Currently exploring the fascinating world of System Design and Distributed Systems