Building a reliable Node.js application | Part II

To deliver optimal user experiences and maintain system health, it's crucial to build Node.js applications that are not only stable but also efficient. A well-optimized application can handle increased traffic, minimize response times, and reduce resource consumption.

In this blog series, we are exploring the three key pillars for building resilient Node.js applications:

  1. Stability: Ensuring consistent and predictable behavior.

  2. Efficiency: Optimizing resource utilization and minimizing latency.

  3. Adaptability: Adapting to changing workloads and scaling seamlessly.

In our last blog, we explored how to build a stable Node.js application.

In this post, we'll focus on efficiency, looking at caching, handling session data and more.

To be efficient, you need to reduce, simplify, or otherwise eliminate work you don’t need to do, and avoid spending more cycles on a task than it is worth. If you want to scale without blowing up your cloud costs, you’ll need to get the most out of the machines you’re working with.

A focus on efficiency gets you the most from the resources you have. You’d be surprised what you can do with a single machine and some careful tuning combined with efficiency-minded architecting.

Let’s look at some tips for what you can be doing to boost efficiency.

Query and resource caching

One of the most common caching strategies is using a CDN. This provides a great boost to performance when serving static resources. But in modern applications, there are generally many layers of data transmission which could be cached to cover other forms of static data.

Database queries can be highly dynamic with changing inputs, but combining the query and its inputs into a single cache key, rather than keying on the query string alone, produces much more consistent and therefore cacheable results. Be careful with the volume of cache key permutations when caching from variable inputs; in many cases you will want a least-recently-used (LRU) strategy to drop rarely used data from the cache.

A tool like async-cache-dedupe can be very helpful for managing the caching and invalidation of database queries. For GraphQL specifically, consider mercurius-cache for a more tightly integrated solution.
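As a rough sketch of caching a parameterized query with async-cache-dedupe: the function arguments become part of the cache key, and the in-memory storage acts as a bounded LRU. The `getUserById` name and the `db` client are hypothetical stand-ins for your own query layer.

```js
import { createCache } from 'async-cache-dedupe'

const cache = createCache({
  ttl: 60,                    // entries expire after 60 seconds
  storage: {
    type: 'memory',
    options: { size: 1024 }   // bounded LRU: least-recently-used entries are dropped
  }
})

// The arguments are serialized into the cache key, so each (query, inputs)
// combination is cached independently and duplicate in-flight calls are deduped.
cache.define('getUserById', async (id) => {
  // db.query is a hypothetical database client call
  return db.query('SELECT * FROM users WHERE id = $1', [id])
})

const user = await cache.getUserById(42)   // hits the database
const again = await cache.getUserById(42)  // served from the cache
```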

Cache at every level

Good caching is shaped like an onion, with a layer of caching over each level of data dependency. In doing so, the application can skip work at many stages along the path from the database to the end user. If a database query is cached and produces a consistent output, then it follows that a component which renders that output to HTML may also be unchanging and therefore cacheable. It follows further that if all the components used within another are unchanging, then the container may also be considered consistent. This follows all the way up to the entire page output potentially being fully cacheable.

With careful use of nested caching strategies, an application can eliminate a substantial amount of its network traffic, eliminating major latency costs and leaving the network free to more efficiently cover the data that really needs transmission.

Nested caching has many advantages, but it makes one of the hardest problems of caching even harder: cache invalidation. When nesting cache values, each consumer which uses that value to produce another cacheable value needs to be able to invalidate their cache when the value they consume invalidates its own cache. This requires defining a dependency algorithm to propagate cache invalidations to any consumers of a cached value.

If a database row is cached, and that cached data is used to produce a cached HTML rendering of the row, then changing the value of that row needs to invalidate not only the row cache but also the cache of the HTML rendering which depended on it.

There are no convenient Node.js modules for managing onion-caching end to end, but it can be managed manually with async-cache-dedupe and a hand-built tree that tracks cache key dependencies.
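As a minimal sketch of that manual approach, here is a dependency map that cascades invalidations from a cached value to everything derived from it. The key names and the Map-based cache are illustrative, not a specific library API.

```js
// dependents maps a cache key to the keys that were derived from it.
const dependents = new Map()

function addDependency (parentKey, childKey) {
  if (!dependents.has(parentKey)) dependents.set(parentKey, new Set())
  dependents.get(parentKey).add(childKey)
}

function invalidate (cache, key) {
  cache.delete(key)
  for (const child of dependents.get(key) ?? []) {
    invalidate(cache, child) // cascade to every consumer of this value
  }
}

// Example: the HTML rendering of a row depends on the row itself.
const cache = new Map()
cache.set('row:42', { id: 42, name: 'Ada' })
cache.set('html:row:42', '<li>Ada</li>')
addDependency('row:42', 'html:row:42')

invalidate(cache, 'row:42') // drops both the row and its rendering
```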

Do non-critical work out-of-band

Message queues can be a blessing when you have bursty traffic, especially for tasks which don’t need immediate attention. A common use for message queues is sending emails. Whether you host the email server yourself or go through an email service, you may encounter processing delays or throttled API connections. With a message queue, you can store that work to be processed whenever a worker process gets to it.

If you had to do that work within the processing of a request, you may find it negatively impacts the consistency and performance of your application. By taking that work out-of-band from the request cycle, a worker can pick it up when it’s not busy.

Pushing tasks out to message queues also helps with reliability, as a long-running request risks being terminated mid-processing if a failure elsewhere brings the process down. Message queues typically require an acknowledgement of completion before a task is removed from the queue, so if the worker goes down, the task will eventually get picked back up.

You can try the mqemitter family of modules to get started with message queues in your application.
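Here is a minimal sketch using mqemitter’s in-process emitter. In a real deployment you would likely reach for a networked variant such as mqemitter-redis so workers can run in separate processes, and sendEmail() is a hypothetical helper.

```js
import mq from 'mqemitter'

const emitter = mq({ concurrency: 5 })

// Worker side: handle queued email jobs whenever there is capacity.
emitter.on('email/send', (message, done) => {
  sendEmail(message.payload)                              // hypothetical email helper
    .catch((err) => console.error('email failed', err))
    .finally(() => done())                                // signal completion so the next job can run
})

// Request side: enqueue the job and respond immediately instead of waiting on it.
emitter.emit({ topic: 'email/send', payload: { to: 'user@example.com' } })
```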

Don’t hold state

When you scale beyond a single process, you need to think about how to share session data between multiple processes. Applications should avoid holding any sort of session state in-memory as it prevents effective scaling and load balancing. A typical strategy is to store session data in Redis/Valkey or some other cache or key/value store.
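A rough sketch of that pattern with the node-redis client (which also works with Valkey); the key naming and the 30-minute expiry are illustrative choices.

```js
import { createClient } from 'redis'

const redis = createClient({ url: process.env.REDIS_URL })
await redis.connect()

// Save the session under a key derived from the session id cookie,
// expiring idle sessions instead of holding them in process memory.
async function saveSession (sessionId, data) {
  await redis.set(`session:${sessionId}`, JSON.stringify(data), { EX: 1800 })
}

// Any process behind the load balancer can load the same session.
async function loadSession (sessionId) {
  const raw = await redis.get(`session:${sessionId}`)
  return raw ? JSON.parse(raw) : null
}
```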

Beyond just sessions, it’s best to avoid holding any data you don’t absolutely need. More memory pressure means more garbage collection and less space to work with for the rapid allocations of request processing data. By keeping your long-lived data footprint light, you can be sure your application has ample headroom for processing requests quickly.

Keep an eye out for any data that might be held longer than the life of a request and consider whether it can live in external storage somewhere: sessions, background tasks, cache data, and any other delayed or reusable data are likely better held elsewhere.

Avoid work you don’t need to do

It can be challenging to spot the unnecessary work you’re doing.

Sometimes it’s obvious, but often it's not. For example, while it’s a perfectly reasonable architectural decision to manage sessions with Redis/Valkey, you could potentially eliminate that extra service and network hop entirely by using JSON Web Tokens and having the client send you a signed token with all the session data already included.

Consider your architectural decisions carefully. There’s often work or complexity you could be omitting or pushing to somewhere else less costly.

Do one thing well

While the Single Responsibility Principle is best known for its association with object-oriented programming, it applies equally well to functional code.

Code should be broken down into foundational components which, in an abstract sense, do only one task. In doing so, the performance profile of those functions becomes more clearly measurable, and the boundaries which delineate code with poor performance become much clearer when using performance analysis tools such as profilers.

Functions which do less branch less, and branching is a major source of cost uncertainty. Single-responsibility functions are easier to test, and the full sequence of what they do is easier to understand. You will undoubtedly need branches and complexity in some places, but if you break that complexity down into small, digestible components, you not only isolate the complexity and make costs more coherent, you also make your system more flexible and composable by reusing those smaller functions in different ways, composing the more complex functions from smaller ones that handle each step.

It helps to think in terms of how pieces of the data model need to change as data flows through the system. Small functions can be built to apply very targeted transformations at each of these points. This makes your data model highly reusable and malleable, and you end up with many tiny functions that remain useful when those segments of data are reused in other places.
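For instance, here’s a tiny sketch of composing a rendering step from single-purpose transformations; the field names and functions are purely illustrative.

```js
// Each function applies one targeted transformation to the data model.
const parseRow = (row) => ({ id: row.id, name: row.name, joined: new Date(row.joined_at) })
const withDisplayName = (user) => ({ ...user, displayName: user.name.trim() })
const toListItem = (user) => `<li>${user.displayName}</li>`

// The more complex operation is just a composition of the small steps,
// and each step can be profiled, tested, and reused on its own.
const renderUserRow = (row) => toListItem(withDisplayName(parseRow(row)))

console.log(renderUserRow({ id: 1, name: ' Ada ', joined_at: '2024-01-01' }))
// <li>Ada</li>
```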

Wrapping up

By prioritizing efficiency, you can significantly enhance the performance and scalability of your Node.js applications. By effectively leveraging caching, carefully considering your architectural decisions, avoiding holding state and applying the single responsibility principle, you can optimize resource utilization, minimize latency, and deliver exceptional user experiences.

Remember, efficiency is an ongoing journey. Continuously evaluate your application's performance, identify bottlenecks, and implement optimizations to ensure it remains performant as your user base and complexity grow.

Want to discover how Platformatic can help you auto-scale your Node.js applications, monitor key Node.js metrics, and minimize application downtime risk? Check out our Command Center.

Written by

Stephen Belanger