NodeJS Timeouts: The I'm Still Working, Just Not Talking to You Problem

David Oluwatobi

One of the courses I found most interesting in my CS education was Computer Performance Systems Evaluation. It is largely concerned with measuring system metrics, CPU utilization being one of them. But honestly, it's about the bigger picture: how effectively our systems leverage all their critical resources, CPU, memory, network bandwidth, and I/O. As software engineers, we are all familiar with this concept of resource utilization, and by extension, the critical need for load shedding.

Under heavy server load, simple I/O can turn into heavy I/O, and every request can end up becoming a hot path: a bottleneck that degrades overall performance and responsiveness. Left unchecked, these paths pile up, drag down throughput, and eventually choke the server entirely. This is exactly where load shedding becomes essential: selectively dropping excess or expensive work to protect the system from cascading failure and keep the core responsive.

While there are several sophisticated methods and patterns for load shedding, timeouts remain one of the simplest and most effective ways to do it. By capping how long the server will spend processing a client request, they prevent resource hogging and give the system a chance to recover under load. Instead of letting everything slow down or fail together, timeouts sacrifice individual slow requests so that the rest of the system can stay responsive.

Timeouts, in my view, serve three purposes:

  1. Control resource usage – don’t let any one request hog CPU, memory, or I/O for too long.

  2. Kill lingering connections – drop sockets that stay open too long, whether due to bugs or malicious intent.

  3. Fail fast – give clients early feedback instead of making them wait forever.

But as an engineer focused more on server-side concerns than the client side, I care more about protecting the server than catering to the client. Timeouts, from my perspective, are there to defend system health: to cut off requests that risk dragging the server down. They’re not about being polite to the client. Any well-built client should handle its own timeout logic. That’s its responsibility, not the server’s. The server’s responsibility is simple: stay online.

Which is why I went on a full-blown rant recently after discovering how NodeJS and Express handle timeouts. It’s basically: “I promised to stop responding, not to stop working.” It closes the connection (✅ point 2), stops responding to the client (✅ point 3), but keeps hogging CPU and memory like nothing happened (❌ point 1). That’s not load shedding. That’s a masked resource leak. From my point of view, apart from the socket disconnection (point 2), you’re honestly better off not setting a timeout at all. At least then you’re not pretending the system is defending itself when it clearly isn’t.

Consider this piece of random code:

// route.ts
app.use((req, res, next) => {
  res.setTimeout(5000, () => {
    console.log("I have timed out");

    res.status(ERROR_STATUS_CODES[ERROR_TYPE_ENUM.REQUEST_TIMEOUT]).json({
      error: {
        message: ERROR_TYPE_DEFAULTS[ERROR_TYPE_ENUM.REQUEST_TIMEOUT],
        type: ERROR_TYPE_ENUM.REQUEST_TIMEOUT
      }
    });
  });
  next();
});

// random-service.ts
//....random code
console.log("Sleeping for 5s")
await new Promise(resolve => setTimeout(resolve, 5000))
console.log("Timed out but I am still processing the request")

// More processing logic follows

// console.log

// Sleeping for 5s
// I have timed out
// Timed out but I am still processing the request

In fact, if you don’t check whether the headers have already been sent before trying to return a response from random-service.ts, you’ll get hit with: Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client. At that point, the connection is already closed, but your logic is still blindly trying to respond. Classic case of wasted work and broken flow.
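
Guarding against that is simple enough: check res.headersSent before replying. Here’s a minimal sketch of that guard, assuming the calling code still has access to res and to a service call like the one above:

// somewhere after the service call (sketch)
const result = await randomService(req.params.id);

if (!res.headersSent) {
  res.json(result);
} else {
  // the timeout middleware already responded; everything computed above was wasted work
  console.log("Response already sent, skipping");
}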

Because Node runs JavaScript on a single thread, there’s no built-in way to force-stop or sudo kill an ongoing computation. Unlike Go, where you can pass a context.Context and cleanly cancel a goroutine, Node gives you nothing. Once execution begins, you’re locked in. There’s no interrupt, no preemption, just a long ride until the function completes. And if that function turns into a hot path? Well, too bad, you just have to ride it out.

There are, however, two mechanisms Node provides to mitigate this. And to be honest, I find both of them horrible for my use case.

AbortController — you spin one up to make your services signal-aware, allowing them to check whether ongoing work should be aborted. But in a real-world code base, especially a layered architecture where services are decoupled from HTTP concerns, this quickly becomes unsustainable. You’d have to spin up an AbortController in every controller, pass the signal down to each service call, and periodically check if it’s still valid. That’s boilerplate all over the place. Worse, you burn CPU on unnecessary checks because realistically, 98–99% of the time (my assumption), your server isn’t under any real load. AbortController only starts to make sense when your server is acting like a peer, calling out to other servers and needing to cancel outbound requests. For internal logic, though, it’s clunky, noisy, and just adds unnecessary code surface. You’re wiring in complexity to solve a problem that mostly doesn’t exist.

// controller.ts
export async function randomController(req, res) {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), 5000);

  try {
    const data = await randomService(req.params.id, { signal: controller.signal });
    res.json(data);
  } catch (err) {
    if (controller.signal.aborted) {
      res.status(503).json({ error: "Request timed out" });
    } else {
      res.status(500).json({ error: "Internal error" });
    }
  } finally {
    clearTimeout(timeout);
  }
}

// service.ts
export async function randomService(id: string, opts?: { signal?: AbortSignal }) {
  const random1 = await dbCall(randomParams);
  if (opts?.signal?.aborted) throw new Error("Aborted");

  const random2 = await dbCall(randomParams);
  if (opts?.signal?.aborted) throw new Error("Aborted");

  return { random1, random2 };
}

Worker Threads — the second option Node gives you is to move heavy computations off the main thread entirely. You spin up a separate thread, send work to it, and if things go sideways, you can kill it with .terminate(). But again, in a typical server setup, this is just not practical. You’re not going to wrap every service call in a worker thread. That’s not architecture, that’s chaos. It’s overkill, over-engineering, armageddon, abyss, torture, hell... pick your poison. I call it losing a limb to save a finger.
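
For completeness, this is roughly what that pattern looks like, a sketch where heavy-worker.js and the deadline are made up for illustration. The one thing it genuinely buys you is a real kill switch via terminate():

// worker-host.ts (sketch): run heavy work off the main thread so it can actually be killed
import { Worker } from "worker_threads";

function runWithDeadline(payload: unknown, ms: number): Promise<unknown> {
  return new Promise((resolve, reject) => {
    // "./heavy-worker.js" is a hypothetical script that does the CPU-heavy part
    const worker = new Worker("./heavy-worker.js", { workerData: payload });

    const timer = setTimeout(() => {
      // unlike the main thread, a worker CAN be force-stopped mid-computation
      worker.terminate();
      reject(new Error("Worker exceeded deadline"));
    }, ms);

    worker.once("message", (result) => {
      clearTimeout(timer);
      resolve(result);
    });
    worker.once("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
  });
}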

In addition to these two, my brain has tried out all sorts of horrible patterns: wrapping service calls in race conditions, chaining nested Promise.race() calls with timers. Madness. It felt like surrounding a code base with land mines, just waiting for something to explode in production. It shouldn't be this hard to tell the server: "If you're choking, drop the damn request."
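
For the record, this is the kind of Promise.race() wrapper I mean (a sketch with made-up names). It looks like a timeout, but notice what it actually does: it only decides which result wins; the losing service call keeps executing and burning CPU in the background:

// a timeout "wrapper" that abandons the result, not the work (sketch)
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  const deadline = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error("Timed out")), ms)
  );
  // whichever settles first wins; the other promise is NOT cancelled
  return Promise.race([work, deadline]);
}

// usage (hypothetical): const data = await withTimeout(randomService(id), 5000);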

My brain has finally concluded (yes, pretty opinionated, but based on my use case) that no reasonable solution exists within the boundaries of individual request paths. Trying to handle load shedding deep inside service logic is a losing game. So what’s next? Obviously, something even worse.

What I need is a gateway, or at least something that behaves like a gateway, sitting just ahead of the route layer, still within the app. It watches traffic, monitors system health, and makes hard decisions before requests touch any business logic. Something ruthless enough to say: “No, the server’s too busy. Try again later.” And yes, I’m fully aware this will probably spike my boss’s blood pressure when he sees it, but hey, something to remember me by.

It can even evolve into a ruthless bouncer, one that checks how fast a given IP is making requests. A basic rate limiter at first, but with just enough brains to start flagging patterns: an IP making an absurd number of requests in a short time? 🚩 Might be DoS. Multiple IPs using the same authentication token and hammering the server? 🚩 Probably a brute force attempt or some sketchy automation. This thing doesn’t need to be smart, it just needs to be aggressive, paranoid, and fast. Block first, think later. Because personally I believe protecting the server matters more than being polite.
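
A first cut of that bouncer could be nothing more than a fixed-window counter per IP. Here’s a rough sketch; the 100-requests-per-10-seconds budget is an arbitrary number picked for illustration, and the in-memory Map would need eviction in a real deployment:

// bouncer.ts (sketch): naive in-memory fixed-window rate limiter per IP
const WINDOW_MS = 10_000; // arbitrary window
const MAX_HITS = 100;     // arbitrary budget per window

const hits = new Map<string, { count: number; windowStart: number }>();

app.use((req, res, next) => {
  const ip = req.ip ?? "unknown";
  const now = Date.now();
  const entry = hits.get(ip);

  if (!entry || now - entry.windowStart > WINDOW_MS) {
    hits.set(ip, { count: 1, windowStart: now });
    return next();
  }

  entry.count++;
  if (entry.count > MAX_HITS) {
    // block first, think later
    return res.status(429).json({ error: "Too many requests." });
  }

  next();
});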

I tried multiple iterations of this gateway. One version monitored CPU usage to decide when to shed load. What did I learn? CPU usage is a terrible metric for NodeJS. Node runs JavaScript on a single thread, meaning only one CPU core does the actual work, regardless of how many cores your machine has. That alone makes per-core CPU usage stats misleading.

One version of the gateway compared total idle vs. active time per core since boot, but that data is cumulative: if the system has been idle for hours and suddenly spikes, the usage still looks low. Totally useless in real time. Another version tried to catch short CPU bursts using instantaneous sampling. That also failed. During autocannon tests, I sent thousands of requests and saw dozens of them (around 80) get dropped, even though average latency was only 480ms and the system wasn’t actually struggling. Bottom line: CPU metrics don’t tell you when Node is choking. They tell you how busy the system is, not how responsive the JavaScript engine is. And in a single-threaded world like Node, that’s the only thing that really matters.
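
For context, the cumulative version boiled down to the classic os.cpus() calculation, something like this sketch (not my exact gateway code); the comment is the whole problem:

import os from "os";

// share of CPU time spent busy SINCE BOOT: an hour of idling dilutes any new spike
function cpuBusySinceBoot(): number {
  let idle = 0;
  let total = 0;
  for (const cpu of os.cpus()) {
    idle += cpu.times.idle;
    total += cpu.times.user + cpu.times.nice + cpu.times.sys + cpu.times.idle + cpu.times.irq;
  }
  return 1 - idle / total;
}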

Eventually, I ditched the CPU-based attempts and tried a version that measured event loop lag. Unlike CPU metrics, event loop lag directly measures whether Node can keep up with its workload. It doesn’t care about system load averages or whether other cores are busy; it tracks the thing that actually runs your JavaScript: the event loop. I used perf_hooks.monitorEventLoopDelay() to measure how far behind the loop was during a 100ms sampling window (which adds 100ms to every request, absolutely not recommended for production; it should really run globally, outside the request paths). If the delay exceeded 50ms, the gateway would reject the request with a 503. Under autocannon load tests, it only rejected requests when Node was genuinely under pressure, when blocking code or high concurrency actually impacted responsiveness.

import { monitorEventLoopDelay } from "perf_hooks";

app.use(async (req, res, next) => {
  const h = monitorEventLoopDelay({ resolution: 10 });
  h.enable();
  // terrible artificial delay: sample the loop for 100ms before deciding
  await new Promise(r => setTimeout(r, 100));
  h.disable();

  // histogram values are in nanoseconds; convert the mean to ms
  const lag = h.mean / 1e6;

  if (lag > 50) {
    res.status(503).json({ error: "Server under load." });
    return;
  }

  next();
});
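
The shape I’d actually run (again, a sketch, not battle-tested) keeps one long-lived histogram outside the request path, samples it on an interval, and lets the middleware do nothing more than read a flag:

// global event-loop-lag watcher (sketch): no per-request sampling delay
import { monitorEventLoopDelay } from "perf_hooks";

const h = monitorEventLoopDelay({ resolution: 10 });
h.enable();

// re-evaluate roughly once a second; both numbers are arbitrary and need tuning
let underPressure = false;
setInterval(() => {
  underPressure = h.mean / 1e6 > 50; // mean lag above 50ms => shed load
  h.reset();
}, 1000).unref();

app.use((req, res, next) => {
  if (underPressure) {
    return res.status(503).json({ error: "Server under load." });
  }
  next();
});

Same signal as before, but sampled once for the whole process instead of per request, so rejecting a request costs essentially nothing.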