Optimizing Node.js Performance: V8 Memory Management & GC Tuning

Matteo Collina
12 min read

A common observation for Node.js developers is the seemingly continuous growth of their application's memory footprint, often measured by the Resident Set Size (RSS) reported by the operating system. This increasing RSS frequently leads to concerns about memory leaks. As a result, many production environments configure monitoring and orchestration tools, like Kubernetes or Docker Swarm, to automatically restart or terminate Node.js processes when their RSS surpasses a certain percentage of the allocated memory limit, often around 80%. This is done assuming high RSS equates to a critical memory problem requiring intervention.

However, it's crucial to understand that high RSS in a Node.js application does not automatically signify a memory leak in the conventional sense. V8, the underlying JavaScript engine, employs sophisticated memory management strategies focused on performance optimization. One key aspect is its tendency to retain memory segments acquired from the operating system, even after the JavaScript objects within those segments have become garbage. V8 holds onto this memory proactively, anticipating future allocation needs, thereby minimizing the performance cost of frequent memory requests and releases to the OS. This reserved but not actively used memory contributes to the overall RSS.

A real memory leak involves unreachable JavaScript objects that the garbage collector consistently fails to reclaim, leading to an unbounded increase in the memory actively used by the application's heap over time. A high but stable RSS, or an RSS that grows but periodically decreases after the major garbage collection cycle, indicates that V8 is managing its memory pool effectively for the given workload. Relying solely on RSS as an indicator for process termination can be misleading and may result in killing healthy applications. Accurate diagnosis requires inspecting V8's internal heap statistics (like heapUsed vs. heapTotal) to differentiate between V8's memory management and actual leaks where active heap usage grows indefinitely.

Understanding V8's Generational Garbage Collector

The V8 engine's garbage collector is built upon the "generational hypothesis," a widely accepted principle in garbage collection theory. This hypothesis posits that most objects a program allocates become garbage shortly after creation ("die young"). In contrast, objects that persist beyond an initial period tend to remain alive for much longer. V8 organizes its memory heap into distinct generations to capitalize on this behavior, primarily the New Space (Young Generation) and the Old Space.

All new JavaScript objects are initially allocated within the New Space. This area is kept relatively small and is optimized for frequent, high-speed garbage collection using an algorithm called "Scavenge." Scavenge divides the New Space into two equal "semi-spaces." Objects are allocated into one semi-space until it fills. At that point, a Scavenge cycle begins: V8 rapidly identifies live objects in the filled semi-space by traversing reachable object references and copies these live objects into the second, currently empty, semi-space. After the copy, the first semi-space (containing only garbage) is completely cleared, and the roles of the semi-spaces are swapped. This fast "copying collection" mechanism is highly efficient when most objects are garbage, but necessitates that the New Space reserves twice the memory of its active allocation area.

Objects that endure through a couple of these rapid Scavenge cycles (typically two) are deemed likely to be long-lived. These surviving objects are then "promoted" – moved from the New Space into the significantly larger Old Space. The Old Space is intended for objects with longer lifecycles. Garbage collection in the Old Space is performed less frequently because it is more time-consuming. It primarily uses a "Mark & Sweep" algorithm: V8 traverses the entire object graph to mark all objects still reachable from the application's roots. Then, during the sweep phase, memory occupied by unmarked (garbage) objects is reclaimed. Optionally, V8 may perform a "Compaction" phase, rearranging the remaining live objects to reduce memory fragmentation. While compaction can lead to memory being returned to the operating system, V8 often retains this compacted space to optimize future Old Space allocations.

The Performance Pitfall: Premature Promotion

Despite its efficiency, the generational garbage collection strategy can sometimes lead to performance degradation, particularly under specific application workloads. Applications that exhibit very high rates of temporary object allocation, such as those heavily involved in complex data transformations, string manipulations, or especially server-side rendering (SSR) using frameworks like React or Next.js, are susceptible. During SSR, for instance, rendering a single complex page might create and quickly discard millions of short-lived objects.

The performance issue arises when the New Space, designed to be small for fast collection, fills up more rapidly than the Scavenge collector can process it. If allocations outpace collections significantly, objects that logically become garbage almost instantly might still be present during one or two Scavenge cycles simply because the collector didn't run frequently enough or quickly enough relative to the allocation rate. Though intended for a brief existence, these objects survive the initial GC cycles.

Having survived the requisite number of Scavenges (often just two), these essentially temporary objects are mistakenly classified as potentially long-lived and are promoted to the Old Space. This "premature promotion" clutters the Old Space with objects that will likely become garbage shortly after arriving there. As the Old Space fills disproportionately with this short-lived garbage, V8 is forced to initiate its slower, more resource-intensive mark-and-sweep collections much more often than necessary if the garbage had been collected efficiently in the New Space. These frequent, longer pauses for Old Space GC directly increase request latency and reduce the application's overall capacity to handle concurrent requests, negatively impacting performance.

Tuning V8: Configuring --max-semi-space-size

To address the performance challenges caused by premature promotion, particularly in applications with high allocation churn, Node.js developers can directly influence V8's garbage collector behavior by tuning the size of the Young Generation (New Space). V8 exposes command-line flags for this purpose, with the most impactful one often being --max-semi-space-size.

This flag allows you to specify the maximum size, in megabytes, for each of the two semi-spaces within the New Space. For example, launching a Node.js application with node --max-semi-space-size=64 index.js instructs V8 to allow each semi-space to grow up to 64 MB. Since only one semi-space is active for allocation at a time while the other is reserved for the next Scavenge copy, the Young Generation might occupy up to 128 MB of reserved heap space, but its active allocation limit before a Scavenge is triggered is 64 MB. Appropriate values depend heavily on the application's specific memory allocation characteristics and the total available system memory, often ranging from 16MB to 256MB per semi-space.

By increasing the --max-semi-space-size, you provide a larger buffer for newly allocated objects. This extension gives the rapid Scavenge collector more time and opportunity to identify and reclaim short-lived garbage before the allocation space becomes full and forces promotion based on survival count rather than actual longevity. A well-tuned, larger New Space significantly reduces the rate of premature promotion to the Old Space. Consequently, the frequency of the slow and disruptive Old Space GC cycles decreases, minimizing application pauses and leading to improved latency and throughput. Determining the ideal size typically involves profiling and benchmarking the application under representative load conditions.

The Strategic Trade-off: Memory for Compute Performance

Modifying the --max-semi-space-size setting represents a conscious and often beneficial performance tuning decision that involves trading memory resources for computational efficiency. When you increase the size of the Young Generation's semi-spaces, you are explicitly allowing the Node.js process, via V8, to reserve and potentially utilize a larger amount of physical memory (which increases its RSS). This increased memory footprint is the "cost" of the tuning.

The "benefit" gained from this increased memory usage is reduced CPU time and application pause time consumed by garbage collection. By enabling the faster Scavenge collector in the larger New Space to handle a more significant proportion of the garbage (short-lived objects), the frequency of invoking the slower and more disruptive Mark & Sweep/Compact cycles on the Old Space is reduced. These Old Space collections are a primary source of noticeable pauses in application execution that negatively affect latency and throughput.

In many production environments, especially within cloud infrastructure or containerized deployments, RAM is often considered a more abundant and less expensive resource compared to CPU cycles or the business impact of poor application responsiveness. Therefore, deliberately allocating more memory to the Young Generation to substantially reduce GC overhead, resulting in lower latency and higher request handling capacity, is frequently a sound engineering and economic trade-off. It requires analyzing the application's behavior and infrastructure costs, but can yield significant performance improvements by optimizing the balance between memory consumption and computational work.

Node v22+ Defaults and Low-Memory Considerations

An essential nuance in V8's memory management emerged around the Node.js v22 release cycle concerning how the default size for the New Space semi-spaces is determined. Unlike some earlier versions with more static defaults, newer V8 versions incorporate heuristics that attempt to set this default size dynamically, often based on the total amount of memory perceived as available to the Node.js process when it starts. The intention is to provide sensible defaults across different hardware configurations without manual tuning.

While this dynamic approach may perform adequately on systems with large amounts of RAM, it can lead to suboptimal or even poor performance in environments where the Node.js process is strictly memory-constrained. This is highly relevant for applications deployed in containers (like Docker on Kubernetes) or serverless platforms (like AWS Lambda or Google Cloud Functions) where memory limits are often set relatively low (e.g., 512MB, 1GB, 2GB). In such scenarios, V8's dynamic calculation might result in an unexpectedly small default --max-semi-space-size, sometimes as low as 1 MB or 8 MB.

As explained earlier, a severely undersized Young Generation drastically increases the probability of premature promotion. Even moderate allocation rates can quickly fill the tiny semi-spaces, forcing frequent promotions and consequently triggering the slow Old Space GC far too often. This results in significant performance degradation compared to what might be expected or what was observed with older Node.js versions under the same memory limit. Therefore, for applications running on Node.js v22 or later within memory-limited contexts, relying solely on the default V8 settings for semi-space size is generally discouraged. Developers should strongly consider profiling their application and explicitly setting the --max-semi-space-size flag to a value that works well for their allocation patterns within the given memory constraints (e.g., 16MB, 32MB, 64MB, etc.), thereby ensuring the Young Generation is adequately sized for efficient garbage collection.
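In containers or serverless platforms, where you may not control the exact start command, the same flag can be passed through the `NODE_OPTIONS` environment variable; the value below is illustrative, not a recommendation:

```shell
# Apply the flag via the environment so it takes effect without editing
# the container's start command.
export NODE_OPTIONS="--max-semi-space-size=64"
node index.js
```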

Memory tuning in practice

Tuning your Node.js server to improve garbage collection performance is simple in theory but hard in practice. Ideally, we want a full request/response cycle to be completely garbage collected by Scavenge, with no object being promoted to the Old Space.

In practice, this turns out to be harder due to the evented nature of Node.js: multiple requests are processed concurrently, so the peak memory consumption is a function of the number of concurrent requests.

To tune the memory for this application, we should assess its concurrency and the average allocation cost per request. Note that one factor is the latency added by the database (or any external service): higher latency increases concurrency and, therefore, memory consumption. Given that the garbage collector also consumes CPU time, an unexpected consequence is that an increase in a dependency's latency results in an increase in CPU usage in your Node.js application.

Based on some napkin calculations, a complex React website would allocate around 3,000 objects to render a page on the server, resulting in 5-10 MB of memory. In this case, I recommend tuning the semi-space to 64 MB or, equivalently, setting the Young Generation to 128 MB.

Let’s consider an example using our Watt application server for Node.js, which has three services running within the same process: a Next.js SSR application, a Fastify server, and a barebone Node.js app. The source code is available at https://github.com/platformatic/composer-next-node-fastify.

Let’s run some benchmarks in the standard configuration:

➜ autocannon -c 100 -d 30 http://localhost:3042/next  
Running 30s test @ http://localhost:3042/next  
100 connections

┌─────────┬───────┬───────┬────────┬────────┬──────────┬──────────┬─────────┐  
│ Stat    │ 2.5%  │ 50%   │ 97.5%  │ 99%    │ Avg      │ Stdev    │ Max     │  
├─────────┼───────┼───────┼────────┼────────┼──────────┼──────────┼─────────┤  
│ Latency │ 77 ms │ 83 ms │ 107 ms │ 116 ms │ 86.43 ms │ 27.86 ms │ 1094 ms │  
└─────────┴───────┴───────┴────────┴────────┴──────────┴──────────┴─────────┘  
┌───────────┬─────────┬─────────┬─────────┬─────────┬──────────┬────────┬─────────┐  
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg      │ Stdev  │ Min     │  
├───────────┼─────────┼─────────┼─────────┼─────────┼──────────┼────────┼─────────┤  
│ Req/Sec   │ 651     │ 651     │ 1,166   │ 1,224   │ 1,149.14 │ 106.74 │ 651     │  
├───────────┼─────────┼─────────┼─────────┼─────────┼──────────┼────────┼─────────┤  
│ Bytes/Sec │ 4.62 MB │ 4.62 MB │ 8.28 MB │ 8.68 MB │ 8.15 MB  │ 758 kB │ 4.61 MB │  
└───────────┴─────────┴─────────┴─────────┴─────────┴──────────┴────────┴─────────┘

Req/Bytes counts sampled once per second.  
# of samples: 30

35k requests in 30.03s, 245 MB read

Thanks to Watt, our Node.js application server, tuning the young generation size is as straightforward as changing the health settings in watt.json:

{  
  …  
  "health": {
    "maxYoungGeneration": 134217728  
  },  
  …  
}

This will set the Young Generation to 128 MB (134217728 bytes = 128 × 1024 × 1024). Rerunning our server (`npm run start-big`) and the benchmarks produces this data:

➜ autocannon -c 100 -d 30 http://localhost:3042/next  
Running 30s test @ http://localhost:3042/next  
100 connections

┌─────────┬───────┬───────┬───────┬────────┬──────────┬──────────┬─────────┐  
│ Stat    │ 2.5%  │ 50%   │ 97.5% │ 99%    │ Avg      │ Stdev    │ Max     │  
├─────────┼───────┼───────┼───────┼────────┼──────────┼──────────┼─────────┤  
│ Latency │ 75 ms │ 79 ms │ 98 ms │ 105 ms │ 82.16 ms │ 26.28 ms │ 1055 ms │  
└─────────┴───────┴───────┴───────┴────────┴──────────┴──────────┴─────────┘  
┌───────────┬─────────┬─────────┬─────────┬─────────┬──────────┬────────┬─────────┐  
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg      │ Stdev  │ Min     │  
├───────────┼─────────┼─────────┼─────────┼─────────┼──────────┼────────┼─────────┤  
│ Req/Sec   │ 689     │ 689     │ 1,238   │ 1,292   │ 1,207.87 │ 110.62 │ 689     │  
├───────────┼─────────┼─────────┼─────────┼─────────┼──────────┼────────┼─────────┤  
│ Bytes/Sec │ 4.89 MB │ 4.89 MB │ 8.78 MB │ 9.17 MB │ 8.57 MB  │ 786 kB │ 4.88 MB │  
└───────────┴─────────┴─────────┴─────────┴─────────┴──────────┴────────┴─────────┘

Req/Bytes counts sampled once per second.  
# of samples: 30

36k requests in 30.02s, 257 MB read

In our tests, simply changing this configuration lowered the P99 latency by 5%, improved the number of req/s by 7%, and improved the overall throughput by 7%. Make sure to benchmark your application, as your mileage may vary.

Next steps

This blog post explores the complexities of V8's memory management in Node.js applications, emphasizing that high Resident Set Size (RSS) does not necessarily indicate a memory leak. It explains V8's generational garbage collection, including the New Space (Young Generation) and Old Space, and how premature promotion of objects can negatively impact performance. Tuning V8 using the --max-semi-space-size flag to adjust the Young Generation's size is recommended to balance memory usage and computational efficiency, particularly in memory-constrained environments like containers or serverless platforms. The post also highlights the trade-off between memory and CPU usage, suggesting that allocating more memory can reduce garbage collection overhead and improve application latency and throughput.

If you need assistance getting your Node.js applications ready for production, we're happy to help. We have expertise in addressing everyday challenges such as event loop blocking, V8 memory management, and setting up effective monitoring, tracing, and Kubernetes deployments. Please reach out if you'd like guidance in navigating the complexities of Node.js in production and ensuring your applications run smoothly.

References:

Watch my talk at dotJS 2025.

There are also a few interesting blog posts on the V8 blog.
