Boosting Node.js Performance: A Comprehensive Guide to Worker Threads
Introduction
JavaScript in Node.js runs on a single main thread, but Node.js can still be made to do multithreading. Node.js scales well thanks to its non-blocking I/O model and performs very well for I/O-heavy work. But what about cases where you need to perform a CPU-intensive task, such as heavy calculations or processing a large number of files?
Won't the event loop get blocked?
Suppose we run a very large for loop, say tens of millions of iterations. No matter how impressive the non-blocking I/O of Node.js is, the event loop stays occupied until the loop finishes.
This results in a slow, unresponsive application.
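To make the blocking visible, here is a minimal sketch. It schedules a timer for 0 ms and then runs a synchronous loop on the main thread. The iteration count and the variable names are arbitrary choices for illustration, not part of the original example.

```javascript
// Sketch: a timer scheduled for 0 ms cannot fire until the synchronous loop ends.
const start = Date.now();
let timerDelayMs = null;

setTimeout(() => {
  // This callback is due almost immediately, but the event loop is busy.
  timerDelayMs = Date.now() - start;
  console.log(`Timer fired after ${timerDelayMs} ms`);
}, 0);

// Synchronous, CPU-bound work on the main thread (iteration count is arbitrary).
let sum = 0;
for (let i = 0; i < 100_000_000; i++) {
  sum += i;
}

const loopDurationMs = Date.now() - start;
console.log(`Loop finished after ${loopDurationMs} ms`);
```

The "Loop finished" line is printed first. The timer callback only runs afterwards, once the main thread is free again, so its delay is at least as long as the loop itself.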
So how can we solve this?
Now imagine we could take the same CPU-intensive work, offload it somewhere else, and get the result back once it has been calculated, without blocking the event loop. That is exactly what we are going to learn today.
Introducing Worker Threads
A worker thread lets us run a piece of code separately from the main program, get the result, and return it to the main program.
Imagine you are using a separate server for intensive calculations and you make an API call whenever you need the result of such a calculation. This separate server handles the heavy lifting, allowing your main application to remain responsive and continue processing other tasks. Worker threads in Node.js operate on a similar principle, but within the same application.
A worker thread runs the given code in an isolated environment where it maintains its own memory and event loop.
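The isolation can be illustrated with a small sketch. It assumes an inline worker created with the `eval: true` option so that everything fits in one file. The `shared` object and the `workerSource` string are hypothetical names used only for this illustration.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical inline worker: it mutates the copy of the data it received.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  workerData.counter += 1;
  parentPort.postMessage(workerData.counter);
`;

const shared = { counter: 1 };

const isolationDemo = new Promise((resolve, reject) => {
  // workerData is cloned for the worker, not shared with it
  const worker = new Worker(workerSource, { eval: true, workerData: shared });
  worker.once('message', (counterInWorker) => {
    resolve({ counterInWorker, counterInMain: shared.counter });
  });
  worker.once('error', reject);
});

isolationDemo.then(({ counterInWorker, counterInMain }) => {
  console.log(`worker sees ${counterInWorker}, main still has ${counterInMain}`);
});
```

The worker increments its own copy of the object, so the value in the main program stays unchanged.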
Suppose your main program, "main-program", is running and has to perform a heavy computation. It offloads that task to a "worker" script using the worker thread feature. The "worker" performs the task and gives the result back to the "main-program". The "main-program" does all of this asynchronously: it does not block the main thread while waiting for the "worker" to finish its job.
Let us look at how exactly this is done through code examples.
Code Examples
The code below runs a for loop from 0 up to a large number directly on the main thread:
const heavyComputation = ({ num }) => {
  // Some heavy computation
  let result = 0;
  for (let i = 0; i < 1000000 * num; i++) {
    result += i;
  }
  console.log('Heavy Computation done...');
  return result;
};

console.log(heavyComputation({ num: 42 }));
In the code above, the main thread is blocked until this large for loop finishes.
Now let us look at the same piece of code, this time using worker threads.
main.js
// main.js
const { Worker } = require('worker_threads');

// Run the computation in a worker and expose the result as a Promise.
const heavyComputation = (workerData) => {
  return new Promise((resolve, reject) => {
    // workerData is copied into the worker when it starts
    const worker = new Worker('./worker.js', { workerData });
    worker.on('message', resolve); // result posted by the worker
    worker.on('error', reject); // uncaught exception inside the worker
    worker.on('exit', (code) => {
      if (code !== 0)
        reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
};

heavyComputation({ num: 42 })
  .then((result) => {
    console.log(`Result from worker: ${result}`);
  })
  .catch((err) => console.error(err));
worker.js
// worker.js
const { parentPort, workerData } = require('worker_threads');

const compute = (num) => {
  // Some heavy computation
  let result = 0;
  for (let i = 0; i < 1000000 * num; i++) {
    result += i;
  }
  console.log('Heavy Computation done from worker thread');
  return result;
};

// Send the result back to the main program as a 'message' event
parentPort.postMessage(compute(workerData.num));
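Both files are in the same directory, and main.js is run from that directory (a relative worker path such as './worker.js' is resolved against the current working directory).
Code example: https://replit.com/@SaishBorkar/Worker-Thread-Example#index.js
You can fork the link above and play around with the code.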
In the example above, the heavy work is offloaded to a worker thread. This leaves the main program and its event loop free to process other requests. When the worker finishes, it sends the result back to the main program.
Notice how the event-driven paradigm is used to communicate between the "main" program and the "worker": the worker calls parentPort.postMessage, and the main program listens for the 'message', 'error', and 'exit' events.
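Messages can also flow in the other direction. The sketch below assumes a long-lived inline worker (created with `eval: true`) that answers each request it receives. The `square` helper and the `id` field used to match replies to requests are illustrative choices, not something the example above requires.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical inline worker that stays alive and answers every request.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', ({ id, num }) => {
    parentPort.postMessage({ id, square: num * num });
  });
`;

const worker = new Worker(workerSource, { eval: true });

// Send one request and resolve when the reply carrying the same id arrives.
let nextId = 0;
const square = (num) =>
  new Promise((resolve, reject) => {
    const id = nextId++;
    const onMessage = (reply) => {
      if (reply.id !== id) return; // reply belongs to another request
      worker.off('message', onMessage);
      worker.off('error', onError);
      resolve(reply.square);
    };
    const onError = (err) => {
      worker.off('message', onMessage);
      reject(err);
    };
    worker.on('message', onMessage);
    worker.once('error', onError);
    worker.postMessage({ id, num }); // main -> worker
  });

const squares = Promise.all([square(3), square(4)]).then(async (results) => {
  console.log(results);
  await worker.terminate(); // the worker keeps listening, so stop it explicitly
  return results;
});
```

Because the worker keeps a 'message' listener registered, it does not exit on its own. That is why the sketch calls `worker.terminate()` once the results are in.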
Considerations and Trade-offs
Programming is all about trade-offs. Worker threads are a good way to use multithreading in Node.js, but there are some things we need to keep in mind.
1. Each worker maintains its own memory and event loop, so every worker adds memory and startup cost.
2. Communication between the main program and the worker has an overhead, because messages are copied between the two threads by default.
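One way to reduce the communication cost for large binary payloads is to transfer an ArrayBuffer instead of copying it, using the optional transfer list of `postMessage`. The sketch below is a rough illustration under that assumption. The 16 MiB size, the `sendBuffer` helper, and the inline worker are arbitrary choices made for the example.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical inline worker: reports the size of the buffer it receives.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.once('message', (buffer) => {
    parentPort.postMessage(buffer.byteLength);
  });
`;

const sendBuffer = (transfer) =>
  new Promise((resolve, reject) => {
    const buffer = new ArrayBuffer(16 * 1024 * 1024); // 16 MiB payload
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', (sizeInWorker) => {
      resolve({ sizeInWorker, sizeInMainAfterSend: buffer.byteLength });
    });
    worker.once('error', reject);
    // With a transfer list the buffer is moved to the worker instead of copied.
    worker.postMessage(buffer, transfer ? [buffer] : []);
  });

const transferDemo = Promise.all([sendBuffer(false), sendBuffer(true)]);

transferDemo.then(([copied, transferred]) => {
  console.log('copied:', copied);
  console.log('transferred:', transferred);
});
```

In the copied case both sides hold a full 16 MiB buffer. In the transferred case the buffer in the main program is detached and reports a length of 0, so the main program can no longer use it after sending.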
Another question that came to my mind while learning about worker threads: what happens if we use one in an API and that API is hit 100 times per second?
Will a new worker be created for each call?
With the code above, yes, and we need to avoid that. Each worker maintains its own event loop and memory, so it takes a share of the overall system memory. Creating workers without limit could even crash the system.
To handle this situation, a fixed number of workers is usually created, generally equal to the number of cores in your system.
A queue is then used to hand tasks to those workers as they become available.
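The pool-and-queue idea can be sketched as follows. This is a minimal illustration, not a production pool: the `WorkerPool` class and its `run`, `dispatch`, and `close` methods are hypothetical names, the worker is inlined with `eval: true`, and a crashed worker is simply dropped rather than replaced.

```javascript
const os = require('os');
const { Worker } = require('worker_threads');

// Hypothetical inline worker: runs the article's loop for each task it receives.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', ({ num }) => {
    let result = 0;
    for (let i = 0; i < 1000000 * num; i++) result += i;
    parentPort.postMessage(result);
  });
`;

class WorkerPool {
  // By default, create one worker per CPU core.
  constructor(size = os.cpus().length) {
    this.workers = Array.from({ length: size }, () => new Worker(workerSource, { eval: true }));
    this.idle = [...this.workers]; // workers that are free right now
    this.queue = []; // tasks waiting for a free worker
  }

  // Queue a task; it starts as soon as a worker is available.
  run(taskData) {
    return new Promise((resolve, reject) => {
      this.queue.push({ taskData, resolve, reject });
      this.dispatch();
    });
  }

  // Pair waiting tasks with idle workers until one of the two runs out.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const { taskData, resolve, reject } = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // hand the worker back to the pool
        resolve(result);
        this.dispatch();
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        reject(err); // sketch only: a crashed worker is not replaced
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(taskData);
    }
  }

  close() {
    return Promise.all(this.workers.map((worker) => worker.terminate()));
  }
}

// Usage: four tasks share two workers; the extra tasks wait in the queue.
const pool = new WorkerPool(2);
const poolResults = Promise.all([1, 2, 3, 4].map((num) => pool.run({ num }))).then(
  async (results) => {
    console.log(results);
    await pool.close();
    return results;
  }
);
```

However many tasks are submitted, at most two workers exist in this usage example, so memory use stays bounded while the queue absorbs bursts of requests.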
Worker threads are preferred for CPU-intensive tasks because they let the app use multiple cores of the system, whereas in the traditional approach your JavaScript code runs on a single core.
Real-World Scenarios
Below are some examples where worker threads are preferred, since they are all CPU-intensive tasks.
Complex mathematical calculations (e.g., prime number generation, factorials)
Cryptographic operations (e.g., hashing, encryption)
Image or video processing (e.g., applying filters, resizing)
Data compression or decompression
Parsing and processing large datasets in memory
Sorting large arrays or complex data structures
All of the above tasks require heavy calculations that mainly use the CPU, so worker threads are recommended for them.
There are also times when worker threads are not recommended. If a task mostly involves heavy I/O operations, it is better to rely on the native asynchronous prowess of Node.js.
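For comparison, here is a small sketch of I/O-bound work handled with plain asynchronous calls and no worker at all. The `readAll` helper, the temporary directory, and the file names are made up for the illustration.

```javascript
const fs = require('fs/promises');
const os = require('os');
const path = require('path');

// Read several files concurrently on the main thread; no worker is involved.
const readAll = (filePaths) =>
  Promise.all(filePaths.map((filePath) => fs.readFile(filePath, 'utf8')));

// Usage with throwaway files created in a temp directory (hypothetical names).
const ioDemo = (async () => {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'io-demo-'));
  const filePaths = ['a.txt', 'b.txt', 'c.txt'].map((name) => path.join(dir, name));
  await Promise.all(filePaths.map((filePath, i) => fs.writeFile(filePath, `file ${i}`)));
  const contents = await readAll(filePaths);
  await fs.rm(dir, { recursive: true, force: true }); // clean up the temp files
  return contents;
})();

ioDemo.then((contents) => console.log(contents));
```

While the reads are in flight, the event loop remains free to handle other work, which is why moving this kind of task into a worker usually adds cost without much benefit.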
Conclusion
Node.js, despite its single-threaded nature, offers powerful capabilities for handling various types of tasks. While its event-driven, non-blocking I/O model excels at managing I/O-intensive operations, CPU-bound tasks have traditionally posed a challenge. This is where worker threads come into play, providing a robust solution for CPU-intensive operations without compromising the responsiveness of your application.
By leveraging worker threads, we can:
Offload heavy computations to separate threads, keeping the main event loop free.
Utilize multi-core processors more effectively, improving overall performance.
Maintain application responsiveness even during complex calculations.
However, it's crucial to remember that worker threads are not a one-size-fits-all solution. They come with their own set of considerations, including memory overhead and inter-thread communication costs. For I/O-bound tasks, Node.js's built-in asynchronous model remains the preferred approach.
When implementing worker threads, consider using a worker pool to manage resources efficiently, especially in high-traffic scenarios like API servers. This approach allows you to balance the benefits of parallelism with the need for controlled resource usage.
As we've seen through our examples, worker threads open up new possibilities for Node.js applications, allowing developers to tackle CPU-intensive tasks such as complex calculations, data processing, and cryptographic operations without sacrificing the performance benefits that drew us to Node.js in the first place.
By understanding when and how to use worker threads, you can take full advantage of Node.js's capabilities, creating more efficient, responsive, and powerful applications. As you move forward, experiment with these techniques in your own projects, always keeping in mind the specific needs of your application and the trade-offs involved in different concurrency models.
sources:
1. https://nodejs.org/api/worker_threads.html
Written by Saish Borkar