Node.js Performance #3: Spread operator, destructuring and Garbage Collection

Intro

Memory management is a critical aspect of any programming language, and Node.js is no exception. While the V8 engine handles most memory-related tasks automatically through its garbage collection mechanism, certain JavaScript patterns like the spread operator and destructuring can significantly impact memory usage and performance.

In this article, we'll explore how the spread operator and destructuring can lead to excessive object creation and memory pollution, and examine how garbage collection in Node.js works to clean up these objects. We'll look at how memory is allocated and freed, what triggers collection cycles, and how this impacts your application's performance.

Before we dive in, let's discuss why understanding both the memory implications of these JavaScript features and garbage collection is crucial for building efficient Node.js applications.

What are Spread Operator and Destructuring?

The spread operator (...) and destructuring are powerful ES6 features that make working with arrays and objects more elegant and concise.

Spread Operator

The spread operator expands an iterable into its individual elements (and, in object literals, copies an object's own enumerable properties). It provides a concise way to:

  • Create shallow copies of arrays or objects

  • Combine multiple arrays or objects

  • Convert iterables like strings into arrays

  • Pass multiple arguments to functions

For example, combining arrays:

const array1 = [1, 2, 3];
const array2 = [4, 5, 6];
const combinedArray = [...array1, ...array2]; // [1, 2, 3, 4, 5, 6]

Or creating an object copy with additional properties:

const original = { x: 1, y: 2 };
const copy = { ...original, z: 3 }; // { x: 1, y: 2, z: 3 }
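
The other two uses are just as concise, spreading a string into its characters and an array into individual function arguments:

const chars = [...'abc'];      // ['a', 'b', 'c']
const nums = [5, 1, 9];
const max = Math.max(...nums); // 9 - each element becomes a separate argument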

Destructuring

Destructuring allows you to extract values from arrays or properties from objects into distinct variables:

Array destructuring:

const [first, second] = [1, 2]; 
// first = 1, second = 2

Object destructuring:

const { name, age } = { name: 'John', age: 30 };
// name = 'John', age = 30
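
Destructuring can also collect the remaining properties with a rest element. Note that this allocates a brand-new object, which is exactly the kind of hidden allocation we'll measure later:

const { name, ...rest } = { name: 'John', age: 30, city: 'Kyiv' };
// rest is a freshly allocated object: { age: 30, city: 'Kyiv' }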

While these features offer elegant syntax and improved readability, they can have performance implications, especially when working with large data structures, as we'll explore later in this article.

What is Garbage Collection?

So, imagine we're software engineers from the '80s writing C, working on scientific software that reads numbers from a file, runs some computations on them, and prints the results. To keep our software simple, let's assume we read an array of numbers from a file and then calculate their sum and average. We can define an array like this:

int array[256];
int size;

Looks fine, but what if there are more than 256 elements? No problem, we can keep bumping the number up, but we'd need to recompile the code every time we increase the array size. That's not efficient. Also, variables declared in C this way are stored on the stack.

The stack is a region of memory that follows a Last-In-First-Out (LIFO) order and is used for storing local variables and function parameters, and for managing function call frames. It's managed automatically by the program as functions are called and return. The size of the stack is fixed and very small (on some operating systems it's only 1 MB even today). Unfortunately, huge arrays can easily overflow the stack, so we need another place to store our data.
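
By the way, JavaScript runs under the same constraint; a simple unbounded recursion makes the fixed stack size visible (we'll meet this exact error again later in this article):

function recurse(depth) {
    return recurse(depth + 1); // each call pushes a new frame onto the fixed-size stack
}
recurse(0); // RangeError: Maximum call stack size exceeded

Luckily, there is the heap for exactly this kind of purpose.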

The heap is a region of memory used for dynamic memory allocation, where data can be allocated and freed in any order. Unlike the stack, the heap's size is limited only by the available system memory. When we allocate memory on the heap, we get a pointer to that memory location, and we're responsible for managing (allocating and freeing) that memory ourselves.

The heap provides much more flexibility than the stack because:

  • Memory can be allocated and deallocated in any order

  • The size of allocated memory can be determined at runtime

  • Memory can persist beyond the scope of the current function

However, this flexibility comes with responsibility - in languages like C, we must carefully manage heap memory to avoid leaks and other memory-related bugs.

Let's look at a simple C program that demonstrates manual memory management using heap while processing numbers from a file:

#include <stdio.h>
#include <stdlib.h>

int main() {
    FILE *file = fopen("numbers.txt", "r");
    if (file == NULL) {
        printf("Error opening file\n");
        return 1;
    }

    // Allocate initial memory for numbers
    int capacity = 10;
    int *numbers = malloc(capacity * sizeof(int));
    if (numbers == NULL) {
        printf("Out of memory\n");
        fclose(file);
        return 1;
    }

    int count = 0;
    int number;
    int sum = 0;

    // Read numbers from file
    while (fscanf(file, "%d", &number) == 1) {
        if (count >= capacity) {
            // Need more space - grow via a temporary pointer so the
            // original block isn't leaked if realloc fails
            capacity *= 2;
            int *grown = realloc(numbers, capacity * sizeof(int));
            if (grown == NULL) {
                printf("Out of memory\n");
                free(numbers);
                fclose(file);
                return 1;
            }
            numbers = grown;
        }
        numbers[count++] = number;
        sum += number;
    }

    // Calculate average (guarding against an empty file)
    double average = count > 0 ? (double)sum / count : 0.0;

    printf("Sum: %d\n", sum);
    printf("Average: %.2f\n", average);

    // Clean up
    free(numbers);
    fclose(file);

    return 0;
}

In this example, we need to explicitly:

  • Allocate memory for our array using malloc()

  • Reallocate memory when we need more space using realloc()

  • Free the memory when we're done using free()

If we forget to free the memory, we create a memory leak. If we access memory after freeing it, we get undefined behavior. Moreover, attackers can sometimes exploit this undefined behavior to execute malicious code.

💡
Buffer overflows are a very common class of vulnerability, and since V8 (the JavaScript engine behind Node.js) is also used in the Chrome browser, a long list of buffer overflows has already been found and fixed in it. Search for "v8 buffer overflow CVE" and you'll find plenty of security bulletins.

This sounds like a huge risk for creating broken programs, and we need some kind of mechanism that allows us to reduce this risk. This is where Garbage Collection (GC) comes into play.

Garbage Collection is an automatic memory management system that tracks memory allocation and identifies when allocated memory is no longer needed, automatically freeing it for reuse. This eliminates the need for manual memory management, reducing the risk of memory leaks and other memory-related bugs.

The basic principle of garbage collection is simple: it identifies memory that is no longer "reachable" or "referenced" by the running program and frees it. This means that developers don't need to explicitly free memory - the garbage collector will handle it automatically when the memory is no longer needed.
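
In JavaScript terms, "reachable" simply means you can still get to the object through some chain of references:

let user = { name: 'Ada' };  // reachable via the `user` variable
let alias = user;            // two references to the same object
user = null;                 // still reachable via `alias`
alias = null;                // no references remain - the object is now
                             // garbage and V8 may reclaim it at any time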

There are several different strategies for garbage collection:

  • Tracing: start from a set of roots (globals, the current call stack) and follow references; anything that can't be reached is garbage. This is the family of algorithms V8 uses.

  • Reference counting: keep a counter of references to each object and free the object when the counter drops to zero (simple, but it struggles with reference cycles).

But, as usual, nothing is perfect in this world, and there are some trade-offs:

  • Some performance overhead due to the GC process running periodically

  • Less predictable memory usage patterns compared to manual memory management

To sum up, we now know how dynamic memory management works, what problems it exposes, and how GC minimizes those problems.

Let's move on.

What can go wrong?

While spread operators and destructuring enhance code readability and elegance, they can significantly impact performance when used incorrectly, especially in performance-critical sections of your code. Let's examine some common patterns that can lead to memory issues and degraded performance.

💡
The benchmark was run on Node.js v22.14.0 on a MacBook Pro with an M1 Pro (10 cores).

The source code for the benchmarking can be found here.

Destructuring in cycles

// BAD: Destructuring creates temporary bindings/references in loops
function badCalculateReportsSum(reports) {
    let sum = 0;

    for (let i = 0; i < reports.length; i++) {
        const { amountA, amountB, amountC } = reports[i];
        sum += amountA + amountB + amountC;
    }

    return sum;
}

// Or in hot, frequently called functions
function calculateSum(report) {
    const { amountA, amountB, amountC } = report;
    const sum = amountA + amountB + amountC;

    return sum;
}

Fixed solution:

// GOOD: Don't use destructuring at all
function goodCalculateReportsSum(reports) {
    let sum = 0;

    for (let i = 0; i < reports.length; i++) {
        const report = reports[i];
        sum += report.amountA + report.amountB + report.amountC;
    }

    return sum;
}

function calculateSum(report) {
    return report.amountA + report.amountB + report.amountC;
}

Let's look at some performance metrics (100k elements in the array):

                          CPU        Memory
badCalculateReportsSum    0.41 ms    108.5 KB
goodCalculateReportsSum   0.40 ms    3.48 KB
delta                     ~same      ~31.2x more memory

As we can see, this small change reduces memory usage by ~31.2 times.

Spread operator

When using spread operators in loops, it's important to be aware that each iteration can create new object allocations, leading to increased memory usage and potential performance issues. This is particularly problematic in high-frequency operations that involve array or object concatenation.

// BAD: Excessive allocations in loops
function badArrayConcatenation(arrays) {
    let result = [];

    // Each spread operation creates a new array, copying all existing elements
    for (let i = 0; i < arrays.length; i++) {
        result = [...result, ...arrays[i]]; // O(n) allocation every iteration!
    }

    return result;
}

Fixed solution:

// GOOD: Efficient concatenation
function goodArrayConcatenation(arrays) {
    let result = [];

    // Push elements directly - no unnecessary copying
    for (let i = 0; i < arrays.length; i++) {
        result.push(...arrays[i]); // Only spreads the current array (for very large inner arrays, beware the argument-limit issue described below)
    }

    return result;
}

Let's look at some performance metrics (1k elements in the array):

                          CPU             Memory
badArrayConcatenation     5.74 ms         8.8 MB
goodArrayConcatenation    0.08 ms         246.96 KB
delta                     ~74.4x slower   ~36.5x more memory

The difference in results is dramatic. The most concerning part is that I frequently encounter code generated by LLMs that contains this exact issue. For example, Claude Sonnet 4 generated a reduce call that copies the accumulator array on every iteration:

accumulateAmounts(amounts: BN[]): BN[] {
  return amounts.reduce<BN[]>((acc, amount) => {
    const sum = acc.length === 0 ? amount : amount.add(acc[acc.length - 1]);
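    // the spread below copies the whole accumulator on every iteration,
    // turning an O(n) accumulation into O(n^2) work and allocations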
    return [...acc, sum];
  }, []);
}
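
A fix in the same spirit (a minimal sketch, assuming each amount exposes a bn.js-style add method as in the snippet above) keeps a running total and pushes it instead of copying the accumulator:

function accumulateAmounts(amounts) {
    const result = [];
    for (const amount of amounts) {
        // keep a running total: O(n) overall instead of O(n^2)
        const prev = result[result.length - 1];
        result.push(prev === undefined ? amount : amount.add(prev));
    }
    return result;
}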

Also, when working with large arrays, the spread operator comes with a hidden and very important limitation:

const bigArr = Array.from({ length: 1_000_000 }, (_, i) => i + 1);

// This will throw an error
Math.min(...bigArr);

This code will throw an error because JavaScript functions have a limit on the number of arguments they can receive. When using the spread operator to pass array elements as individual arguments, there's a maximum number of arguments that can be handled.

The error you'll see is:

RangeError: Maximum call stack size exceeded

This happens because:

  • The spread operator attempts to convert each array element into a separate function argument

  • Function arguments are pushed onto the stack, which has a fixed size limit

  • With 1 million elements, this quickly exceeds the stack size limit, causing the "Maximum call stack size exceeded" error

For large arrays, use methods that don't require spreading the entire array, such as:

// Instead of Math.min(...bigArr), use:
const min = bigArr.reduce((min, val) => Math.min(min, val), Infinity);

// Or a simple loop
let min = Infinity;
for (let i = 0; i < bigArr.length; i++) {
  if (bigArr[i] < min) min = bigArr[i];
}

This limitation is another reason to be cautious with the spread operator when working with data of unknown or potentially large size.

Is the Situation Really That Bad?

Let’s take a look at a more complicated example:

function badParameterDestructuring({ a, b, c, d, e, f, g, h }) {
    return {
        a: a * 2,
        b: b + 10,
        c: c.toUpperCase(),
        d: d || 'default',
        e, f, g, h
    };
}

function goodParameterAccess(data) {
    return {
        a: data.a * 2,
        b: data.b + 10,
        c: data.c.toUpperCase(),
        d: data.d || 'default',
        e: data.e,
        f: data.f,
        g: data.g,
        h: data.h
    };
}

It seems like the bad implementation should consume significantly more memory, but here is what the results reveal (100k elements in the array):

                            CPU            Memory
badParameterDestructuring   7.13 ms        15.26 MB
goodParameterAccess         7.76 ms        15.26 MB
delta                       ~0.1x slower   ~same

Whoa, memory consumption is the same! But why? Let's take a look at the generated bytecode (printed with node --print-bytecode --print-bytecode-filter=<function name>):

[generated bytecode for function: badParameterDestructuring (0x026cb9186909 <SharedFunctionInfo badParameterDestructuring>)]
Bytecode length: 125
Parameter count 2
Register count 13
Frame size 104
 2576 E> 0x3163b872f588 @    0 : 2f 03 00 00       GetNamedProperty a0, [0], [0]
         0x3163b872f58c @    4 : c9                Star0
 2579 E> 0x3163b872f58d @    5 : 2f 03 01 02       GetNamedProperty a0, [1], [2]
         0x3163b872f591 @    9 : c8                Star1
 2582 E> 0x3163b872f592 @   10 : 2f 03 02 04       GetNamedProperty a0, [2], [4]
         0x3163b872f596 @   14 : c7                Star2
 2585 E> 0x3163b872f597 @   15 : 2f 03 03 06       GetNamedProperty a0, [3], [6]
         0x3163b872f59b @   19 : c6                Star3
 2588 E> 0x3163b872f59c @   20 : 2f 03 04 08       GetNamedProperty a0, [4], [8]
         0x3163b872f5a0 @   24 : c5                Star4
 2591 E> 0x3163b872f5a1 @   25 : 2f 03 05 0a       GetNamedProperty a0, [5], [10]
         0x3163b872f5a5 @   29 : c4                Star5
 2594 E> 0x3163b872f5a6 @   30 : 2f 03 06 0c       GetNamedProperty a0, [6], [12]
         0x3163b872f5aa @   34 : c3                Star6
 2597 E> 0x3163b872f5ab @   35 : 2f 03 07 0e       GetNamedProperty a0, [7], [14]
         0x3163b872f5af @   39 : c2                Star7
 2646 S> 0x3163b872f5b0 @   40 : 0b f9             Ldar r0
 2648 E> 0x3163b872f5b2 @   42 : 49 02 10          MulSmi [2], [16]
         0x3163b872f5b5 @   45 : c1                Star8
 2674 S> 0x3163b872f5b6 @   46 : 0b f8             Ldar r1
 2676 E> 0x3163b872f5b8 @   48 : 47 0a 11          AddSmi [10], [17]
         0x3163b872f5bb @   51 : c0                Star9
 2705 S> 0x3163b872f5bc @   52 : 2f f7 08 12       GetNamedProperty r2, [8], [18]
         0x3163b872f5c0 @   56 : bd                Star12
 2705 E> 0x3163b872f5c1 @   57 : 60 ed f7 14       CallProperty0 r12, r2, [20]
         0x3163b872f5c5 @   61 : bf                Star10
 2741 S> 0x3163b872f5c6 @   62 : 0b f6             Ldar r3
         0x3163b872f5c8 @   64 : 9b 04             JumpIfToBooleanTrue [4] (0x3163b872f5cc @ 68)
         0x3163b872f5ca @   66 : 13 09             LdaConstant [9]
         0x3163b872f5cc @   68 : be                Star11
 2814 S> 0x3163b872f5cd @   69 : 81 0a 16 29       CreateObjectLiteral [10], [22], #41
         0x3163b872f5d1 @   73 : bd                Star12
         0x3163b872f5d2 @   74 : 0b f1             Ldar r8
 2830 E> 0x3163b872f5d4 @   76 : 36 ed 00 17       DefineNamedOwnProperty r12, [0], [23]
         0x3163b872f5d8 @   80 : 0b f0             Ldar r9
 2849 E> 0x3163b872f5da @   82 : 36 ed 01 19       DefineNamedOwnProperty r12, [1], [25]
         0x3163b872f5de @   86 : 0b ef             Ldar r10
 2868 E> 0x3163b872f5e0 @   88 : 36 ed 02 1b       DefineNamedOwnProperty r12, [2], [27]
         0x3163b872f5e4 @   92 : 0b ee             Ldar r11
 2887 E> 0x3163b872f5e6 @   94 : 36 ed 03 1d       DefineNamedOwnProperty r12, [3], [29]
         0x3163b872f5ea @   98 : 0b f5             Ldar r4
 2903 E> 0x3163b872f5ec @  100 : 36 ed 04 1f       DefineNamedOwnProperty r12, [4], [31]
         0x3163b872f5f0 @  104 : 0b f4             Ldar r5
 2910 E> 0x3163b872f5f2 @  106 : 36 ed 05 21       DefineNamedOwnProperty r12, [5], [33]
         0x3163b872f5f6 @  110 : 0b f3             Ldar r6
 2917 E> 0x3163b872f5f8 @  112 : 36 ed 06 23       DefineNamedOwnProperty r12, [6], [35]
         0x3163b872f5fc @  116 : 0b f2             Ldar r7
 2924 E> 0x3163b872f5fe @  118 : 36 ed 07 25       DefineNamedOwnProperty r12, [7], [37]
         0x3163b872f602 @  122 : 0b ed             Ldar r12
 2968 S> 0x3163b872f604 @  124 : ae                Return
Constant pool (size = 11)
0x3163b872f4e1: [TrustedFixedArray]
 - map: 0x22fb03b009a9 <Map(TRUSTED_FIXED_ARRAY_TYPE)>
 - length: 11
           0: 0x22fb03b04b11 <String[1]: #a>
           1: 0x22fb03b04b29 <String[1]: #b>
           2: 0x22fb03b04b41 <String[1]: #c>
           3: 0x22fb03b04b59 <String[1]: #d>
           4: 0x22fb03b04b71 <String[1]: #e>
           5: 0x22fb03b04b89 <String[1]: #f>
           6: 0x22fb03b04ba1 <String[1]: #g>
           7: 0x22fb03b04bb9 <String[1]: #h>
           8: 0x22fb03b0d7e1 <String[11]: #toUpperCase>
           9: 0x22fb03b07059 <String[7]: #default>
          10: 0x03f7b0a0d311 <ObjectBoilerplateDescription[16]>

And:

[generated bytecode for function: goodParameterAccess (0x30f650746961 <SharedFunctionInfo goodParameterAccess>)]
Bytecode length: 92
Parameter count 2
Register count 3
Frame size 24
 3075 S> 0x1953743b0188 @    0 : 81 00 00 29       CreateObjectLiteral [0], [0], #41
         0x1953743b018c @    4 : c9                Star0
 3096 E> 0x1953743b018d @    5 : 2f 03 01 02       GetNamedProperty a0, [1], [2]
 3098 E> 0x1953743b0191 @    9 : 49 02 01          MulSmi [2], [1]
         0x1953743b0194 @   12 : 36 f9 01 04       DefineNamedOwnProperty r0, [1], [4]
 3115 E> 0x1953743b0198 @   16 : 2f 03 02 07       GetNamedProperty a0, [2], [7]
 3117 E> 0x1953743b019c @   20 : 47 0a 06          AddSmi [10], [6]
         0x1953743b019f @   23 : 36 f9 02 09       DefineNamedOwnProperty r0, [2], [9]
 3135 E> 0x1953743b01a3 @   27 : 2f 03 03 0b       GetNamedProperty a0, [3], [11]
         0x1953743b01a7 @   31 : c7                Star2
 3137 E> 0x1953743b01a8 @   32 : 2f f7 04 0d       GetNamedProperty r2, [4], [13]
         0x1953743b01ac @   36 : c8                Star1
 3137 E> 0x1953743b01ad @   37 : 60 f8 f7 0f       CallProperty0 r1, r2, [15]
         0x1953743b01b1 @   41 : 36 f9 03 11       DefineNamedOwnProperty r0, [3], [17]
 3164 E> 0x1953743b01b5 @   45 : 2f 03 05 13       GetNamedProperty a0, [5], [19]
         0x1953743b01b9 @   49 : 9b 04             JumpIfToBooleanTrue [4] (0x1953743b01bd @ 53)
         0x1953743b01bb @   51 : 13 06             LdaConstant [6]
         0x1953743b01bd @   53 : 36 f9 05 15       DefineNamedOwnProperty r0, [5], [21]
 3192 E> 0x1953743b01c1 @   57 : 2f 03 07 17       GetNamedProperty a0, [7], [23]
         0x1953743b01c5 @   61 : 36 f9 07 19       DefineNamedOwnProperty r0, [7], [25]
 3207 E> 0x1953743b01c9 @   65 : 2f 03 08 1b       GetNamedProperty a0, [8], [27]
         0x1953743b01cd @   69 : 36 f9 08 1d       DefineNamedOwnProperty r0, [8], [29]
 3222 E> 0x1953743b01d1 @   73 : 2f 03 09 1f       GetNamedProperty a0, [9], [31]
         0x1953743b01d5 @   77 : 36 f9 09 21       DefineNamedOwnProperty r0, [9], [33]
 3237 E> 0x1953743b01d9 @   81 : 2f 03 0a 23       GetNamedProperty a0, [10], [35]
         0x1953743b01dd @   85 : 36 f9 0a 25       DefineNamedOwnProperty r0, [10], [37]
         0x1953743b01e1 @   89 : 0b f9             Ldar r0
 3243 S> 0x1953743b01e3 @   91 : ae                Return
Constant pool (size = 11)
0x1953743b00e1: [TrustedFixedArray]
 - map: 0x168f3a0409a9 <Map(TRUSTED_FIXED_ARRAY_TYPE)>
 - length: 11
           0: 0x2d851f98b441 <ObjectBoilerplateDescription[16]>
           1: 0x168f3a044b11 <String[1]: #a>
           2: 0x168f3a044b29 <String[1]: #b>
           3: 0x168f3a044b41 <String[1]: #c>
           4: 0x168f3a04d7e1 <String[11]: #toUpperCase>
           5: 0x168f3a044b59 <String[1]: #d>
           6: 0x168f3a047059 <String[7]: #default>
           7: 0x168f3a044b71 <String[1]: #e>
           8: 0x168f3a044b89 <String[1]: #f>
           9: 0x168f3a044ba1 <String[1]: #g>
          10: 0x168f3a044bb9 <String[1]: #h>

The second bytecode is clearly cleaner and smaller, but why do the two functions end up with the same memory footprint?

Remember the first article of this series? There is a component (called TurboFan) that translates bytecode to machine code and can apply optimizations (and deoptimizations). Here is a case where TurboFan optimized the first bytecode for us. And that should be the message: Node.js knows how to optimize your code (and usually does), so it often runs faster and consumes fewer resources than a naive reading would suggest.

What’s under the hood

V8 implements a generational garbage collection system. This means that objects are divided into different generations based on their age and are collected differently depending on their generation.

Generational Hypothesis

The generational hypothesis states that most objects die young. Based on this, V8 divides the heap into different generations:

  • Young Generation (New Space): Where new objects are allocated. It's small and designed for quick garbage collections.

  • Old Generation (Old Space): Where objects that survive several garbage collections in the young generation are moved.
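
You don't have to take this on faith, since Node.js exposes these spaces through the built-in v8 module:

const v8 = require('v8');

// Each entry is one of V8's heap spaces, including 'new_space'
// (young generation) and 'old_space' (old generation)
for (const space of v8.getHeapSpaceStatistics()) {
    console.log(space.space_name, `${(space.space_used_size / 1024).toFixed(1)} KB used`);
}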

Collection Types

V8 uses different types of garbage collection depending on the generation:

Scavenge Collection (Young Generation)

This is a fast and frequent collection that occurs in the young generation. It uses a copying collector algorithm called Cheney's algorithm:

  • The young generation is split into two equal spaces: "to-space" and "from-space"

  • New objects are allocated in "from-space"

  • During collection, live objects are copied to "to-space"

  • The spaces are then swapped
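
To build intuition, here's a toy model of a Cheney-style copying collector in plain JavaScript. It's purely illustrative (V8 obviously doesn't store objects as JS literals), but it shows the core trick: to-space doubles as the scan queue. "Objects" here are nodes of the form { value, ref }, and anything not reachable from the roots simply never gets copied:

function scavenge(roots) {
    const toSpace = [];
    const forwarded = new Map(); // old object -> its copy in to-space

    const copy = (obj) => {
        if (obj === null) return null;
        if (!forwarded.has(obj)) {
            forwarded.set(obj, { value: obj.value, ref: obj.ref });
            toSpace.push(forwarded.get(obj));
        }
        return forwarded.get(obj);
    };

    const newRoots = roots.map(copy);
    // Cheney's trick: scan to-space like a queue, copying children as we go
    for (let i = 0; i < toSpace.length; i++) {
        toSpace[i].ref = copy(toSpace[i].ref);
    }
    // anything never copied is garbage; from-space is reclaimed wholesale
    return newRoots;
}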

Mark-Sweep Collection (Old Generation)

This is a more thorough but slower collection that occurs in the old generation:

  • Marking Phase: V8 identifies and marks all reachable objects

  • Sweeping Phase: Unmarked objects are considered garbage and their memory is reclaimed

Mark-Compact Collection

This is a special type of collection that helps prevent memory fragmentation:

  • Similar to Mark-Sweep but includes an additional compaction step

  • Live objects are moved to make memory contiguous

  • Helps reduce memory fragmentation but is more expensive than Mark-Sweep
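
If you want to watch these cycles happen in a running process, modern Node.js versions expose GC events through perf_hooks (a sketch; field names can differ slightly between Node versions):

const { PerformanceObserver, constants } = require('perf_hooks');

// Map GC kinds to the collection types described above:
// 'minor' = scavenge (young generation), 'major' = mark-sweep/mark-compact
const kinds = {
    [constants.NODE_PERFORMANCE_GC_MINOR]: 'minor (scavenge)',
    [constants.NODE_PERFORMANCE_GC_MAJOR]: 'major (mark-sweep/compact)',
    [constants.NODE_PERFORMANCE_GC_INCREMENTAL]: 'incremental',
    [constants.NODE_PERFORMANCE_GC_WEAKCB]: 'weak callbacks',
};

const obs = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
        console.log(`GC ${kinds[entry.detail.kind]}: ${entry.duration.toFixed(2)} ms`);
    }
});
obs.observe({ entryTypes: ['gc'] });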

Optimization Strategies

V8 employs several strategies to optimize garbage collection:

  • Incremental Collection: Breaks up large collections into smaller chunks to reduce pause times

  • Concurrent Collection: Performs some collection work in parallel with JavaScript execution

  • Lazy Sweeping: Delays some cleanup work until new allocations are needed

Understanding these implementation details is crucial for optimizing Node.js applications, as it helps developers make informed decisions about object lifecycle management and memory usage patterns.

Summary

This article explored how spread operators and destructuring in JavaScript can impact memory usage and performance in Node.js (V8):

  • Destructuring in loops creates temporary objects that increase memory usage by up to 31x compared to direct property access

  • Spread operators for array concatenation in loops can be 74x slower and use 36x more memory than more efficient alternatives

  • LLM-generated code often contains inefficient patterns like these, such as spreading the accumulator on every iteration of a reduce

  • Avoid premature optimizations - Node.js and V8 often optimize code automatically (as seen in the parameter destructuring example)

  • V8's garbage collection system uses generational collection (young/old generations) with specialized algorithms for each

  • Understanding memory management helps you make better decisions about when to use these JavaScript features

The key takeaway: use spread operators and destructuring for readability, but be careful with them in performance-critical code, especially inside loops or when working with large data structures.
