Multi-threading vs Multi-Processing

Multi-threading and multi-processing are two main ways we can achieve concurrency in an application.
Concurrency lets an application make progress on several tasks at once, which can help us increase throughput and/or reduce latency. For example, having multiple threads handling an endpoint allows us to service multiple users at the same time.
What is multi-threading?
Multi-threading means running more than one thread within the same process at the same time. Since the threads are running under the same process, they share the same memory space and resources (e.g. variables, file handles).
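To make the shared memory space concrete, here is a minimal Python sketch in which several threads append to the same list. The function and variable names are illustrative assumptions, not part of any particular application, and the lock is there because shared data needs to be protected when several threads write to it.

```python
import threading


def squares_with_threads(numbers):
    """Compute squares on separate threads that all append to one shared list."""
    results = []                      # one list, visible to every thread
    results_lock = threading.Lock()   # guards writes to the shared list

    def record_square(n):
        square = n * n
        with results_lock:
            results.append(square)

    threads = [threading.Thread(target=record_square, args=(n,)) for n in numbers]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()                 # wait for every thread to finish
    return sorted(results)            # completion order can vary, so sort


print(squares_with_threads(range(5)))  # [0, 1, 4, 9, 16]
```

No copying or message passing is needed here: every thread writes directly into the same `results` object.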
What is multi-processing?
A process has its own memory and resources, and multi-processing means running multiple processes in parallel. Since processes do not share a memory space, communication has to happen through inter-process communication, such as pipes, queues or network calls.
When to pick multi-threading vs multi-processing?
The first thing to consider when picking between multi-threading and multi-processing is actually the language the application is written in.
Multi-threading in Python
For example, in CPython (the standard Python implementation), the Global Interpreter Lock (GIL) allows only one thread to execute Python code at a time. So if the bottleneck is the CPU, the choice should be multi-processing, because threads cannot run Python code in parallel across cores. The documentation for Python's threading module puts it this way:
In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing or concurrent.futures.ProcessPoolExecutor. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.
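As a rough sketch of that advice, the following Python example spreads a CPU-bound function across processes with `concurrent.futures.ProcessPoolExecutor`. The prime-counting function and the input sizes are assumptions made up for the example; any pure, CPU-heavy function would serve.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for candidate in range(2, limit):
        # A number is prime if no divisor up to its square root divides it.
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [20_000, 30_000, 40_000, 50_000]  # illustrative workload sizes
    # By default the pool starts about as many worker processes as there are CPUs.
    with ProcessPoolExecutor() as pool:
        for limit, primes in zip(limits, pool.map(count_primes, limits)):
            print(f"primes below {limit}: {primes}")
```

Each call runs in its own process with its own interpreter, so the calls are not serialized by the GIL. The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module.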
Multi-threading in Java
Meanwhile, in Java, multithreading is typically the default choice due to robust threading support with the java.util.concurrent package. The JVM is able to handle multiple threads in parallel, across multiple cores.
Multi-threading vs Multi-processing
Feature | Multi-threading | Multi-processing |
Memory usage | Lower (shared memory) | Higher (separate memory spaces) |
Overhead | Lower (memory and resources are shared, so creating a new thread requires fewer resources) | Higher |
Communication | Easier (part of the same process, can communicate through thread-safe shared variables) | Harder (needs inter-process communication) |
Crash isolation | Lower (a thread that crashes the process takes down all other threads) | Higher (processes are separate) |
Best for | I/O-bound tasks | CPU-bound tasks |
Choosing thread count
When CPU core count matters
If we are running a single thread and the task it executes is maxing out CPU usage, then we say it's a CPU-bound task. For CPU-bound tasks, core count matters more because the number of threads that can do useful work at the same time is bounded by the number of cores we have.
Meanwhile, I/O-bound tasks are tasks where most of the time is spent waiting for external input/output, such as reading from files or databases or waiting for API responses. For these, we can have many I/O-bound threads running on a single core, as CPU usage per thread is low and so the CPU can switch between threads.
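A small Python sketch can illustrate why many I/O-bound threads fit on few cores. Here `time.sleep` stands in for a network or disk wait, and the function names and delay values are assumptions for the illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_call(delay_s):
    """Stand-in for an I/O wait (e.g. an API call); the thread is idle while sleeping."""
    time.sleep(delay_s)
    return delay_s


def run_concurrently(delays, max_workers):
    """Run all fake I/O calls on a thread pool and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fake_io_call, delays))
    return results, time.perf_counter() - start


results, elapsed = run_concurrently([0.2] * 8, max_workers=8)
print(f"8 calls of 0.2s each finished in {elapsed:.2f}s")
```

Run one after another, the eight calls would take about 1.6 seconds. With eight threads the waits overlap, so the total should be close to the duration of a single call, even on a single core.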
CPU Core Count vs Thread Count for CPU-bound tasks
For CPU-bound tasks, the thread count should be at most the number of threads the CPU cores can run at the same time. For example, if I am using a 6-core computer, and each core can only run one thread, then I will be maxing out the core usage for my application if I run 6 threads.
It is still possible to have more threads than the cores can run at once, but doing so forces the CPU to context switch to alternate between threads, which could lead to a decline in performance instead.
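One way to turn this into a starting point is sketched below in Python. The helper `suggested_workers` and its `io_multiplier` parameter are hypothetical, and the default multiplier of 4 is an arbitrary assumption rather than a recommendation; benchmarking should decide the real number.

```python
import os


def suggested_workers(task_kind, io_multiplier=4):
    """Return a starting worker count: one per core for CPU-bound work,
    several per core for I/O-bound work (the multiplier is an assumption)."""
    cores = os.cpu_count() or 1   # cpu_count() can return None
    if task_kind == "cpu":
        return cores
    if task_kind == "io":
        return cores * io_multiplier
    raise ValueError(f"unknown task kind: {task_kind!r}")


print("CPU-bound workers:", suggested_workers("cpu"))
print("I/O-bound workers:", suggested_workers("io"))
```

Note that `os.cpu_count()` reports logical CPUs, so on machines where each core runs more than one hardware thread the figure will be higher than the physical core count.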
Conclusion
Ultimately, the choice between multi-threading and multi-processing depends on what constrains your program, CPU or I/O, as well as the language you're using. For CPU-bound tasks, consider your CPU core count and how many threads each core can handle. Regardless of the task type, performance benchmarking is essential to determine the optimal number of threads or processes.
