Rate Limiting in Node.js & Express with Redis — Fixed & Sliding Window Explained

Harsh Sharma
15 min read

When you launch an API into the wild, traffic is both a blessing and a burden. It signals traction, real-world usage, maybe even growth. But it also comes with a less glamorous side — bots scraping your endpoints, users hammering your services with abusive request rates, and infrastructure bills climbing before your startup even gets out of beta.

In this guide, I’ll walk you through how I built a secure and performant rate-limited API gateway using Node.js, Express, and Redis — a real-world solution we used in production to keep our service alive during a high-traffic launch week. Here’s what we’ll cover:

  • Explain what Redis is and why it’s a great fit for rate limiting.

  • Implement two production-ready middlewares in Node.js/Express: Fixed Window and Sliding Window.

  • Walk through the code line by line (including ZSET commands like ZADD, ZCARD, ZCOUNT, ZREMRANGEBYSCORE) and explain why we use them.

  • Add real routes: /resend-otp and /verify-otp, including attempt throttling.

  • Show the Redis connection both ways: cloud/managed Redis (no Docker) and Docker.

  • Use Redis Insight to see keys.

  • Compare Redis with MongoDB/Postgres for this job.

  • Bonus: Add a /todos caching example with realistic timing results.


Table of Contents

  1. What is Redis? Why use it for rate limiting?

  2. When & where to apply rate limiting (real examples)

  3. Project setup & Redis connection (Cloud + Docker)

  4. Fixed Window rate limiter (simple, fast)

  5. Sliding Window rate limiter (fair, production-friendly)

  6. Line-by-line: every Redis Z* command explained

  7. Integrate with real routes: /resend-otp and /verify-otp

  8. Why keys include IP and a prefix

  9. EXPIRE vs EXPIREAT (and why we use EXPIRE here)

  10. Why Redis over MongoDB/Postgres for rate limiting

  11. Redis data types you’ll touch in this guide

  12. Bonus: API response caching (/todos) with TTL

  13. Cheat sheet: pick Fixed or Sliding?


What is Redis? Why use it for rate limiting?

Redis is an in-memory data store with microsecond-level latency. It supports data structures such as Strings, Sets, Sorted Sets, Hashes, Lists, Geospatial indexes, and more.

  • Super fast (microseconds latency) → perfect for caching, sessions, and rate limiting.

  • Works as key–value storage.

  • Supports many data types (Strings, Lists, Hashes, Sets, Sorted Sets, and more).

For rate limiting we need:

  • Low latency counters per user/IP/route

  • Fast expiry/cleanup

  • Atomic increments and sorted time windows

Redis nails all three, which is why it’s the industry standard for rate limiting, sessions, queues, and caching.
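
A quick taste of those primitives in redis-cli: an atomic counter that creates itself on first use, gets a TTL, and expires on its own (the key name is just an example):

127.0.0.1:6379> INCR demo:203.0.113.7
(integer) 1
127.0.0.1:6379> EXPIRE demo:203.0.113.7 60
(integer) 1
127.0.0.1:6379> TTL demo:203.0.113.7
(integer) 60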


When & where to apply rate limiting (real examples)

Place a rate limiter as Express middleware in front of routes that are sensitive to abuse:

  • Auth flows: /login, /verify-otp, /resend-otp

  • Public APIs: /search, /contact, /comments

  • Costly endpoints: data export, AI calls, payment retries

Real example we’ll implement:

  • /resend-otp: 2 requests per 5 minutes per IP

  • /verify-otp: 5 wrong attempts per 15 minutes per email+IP


Project setup & Redis connection (Cloud + Docker)

1) Install dependencies

npm i express validator bcrypt jsonwebtoken redis dotenv cookie-parser cors mongoose

If you’re behind a proxy/load balancer (e.g., Nginx/Render/Vercel), set app.set('trust proxy', 1) so req.ip is correct.

project-root/
├─ .env.example
├─ package.json
├─ index.js
├─ src/
│  ├─ config/
│  │  ├─ database.js
│  │  └─ redis.js
│  ├─ middleware/
│  │  ├─ rateLimiterFixed.js
│  │  └─ rateLimiterSliding.js
│  ├─ routes/
│  │  ├─ auth.js
│  │  └─ todos.js
│  ├─ models/
│  │  └─ User.js
│  └─ utils/
│     └─ sendMail.js
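
The .env.example in the tree above might look like this (placeholder values; the variable names match the code below, and MONGODB_URI is illustrative — use whatever your database.js actually reads):

# .env.example
PORT=3000
REDIS_HOST=your-redis-host.example.com
REDIS_PORT=12345
REDIS_PASSWORD=your-redis-password
MONGODB_URI=mongodb+srv://user:password@cluster.example.mongodb.net/app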

2) Connect to Redis (Cloud/Managed — no Docker)

// config/redis.js
const redis = require("redis");

const redisClient = redis.createClient({
  username: 'default',                                  // usually 'default'
  password: process.env.REDIS_PASSWORD,                 // from your cloud provider
  socket: {
    host: process.env.REDIS_HOST,                       // e.g. 'redis-123.c123.ap-south-1-2.ec2.cloud.redislabs.com'
    port: Number(process.env.REDIS_PORT),               // e.g. 12345
  }
});

redisClient.on('error', (err) => console.error('Redis error:', err));
redisClient.on('connect', () => console.log('Redis connected'));

module.exports = redisClient;

// index.js
require("dotenv").config();
const express = require("express");
const app = express();
const cookieParser = require("cookie-parser");
const cors = require("cors");
app.use(express.json());
app.use(cookieParser());
app.use(cors());
const redisClient = require("./src/config/redis");
const connectDb = require("./src/config/database");
const PORT = process.env.PORT || 3000;

const initializeConnection = async () => {
  try {
    // Connect Redis and MongoDB in parallel, then start listening
    await Promise.all([redisClient.connect(), connectDb()]);
    console.log("Redis and MongoDB connected");

    app.listen(PORT, () => {
      console.log(`Server is running on port ${PORT}`);
    });
  } catch (error) {
    console.error("Failed to initialize:", error);
  }
};

initializeConnection();

3) Run and connect Redis using Docker

Docker is a platform to run applications inside containers.

  • Container → A lightweight, isolated environment that has its own OS libraries, binaries, and dependencies.

  • Image → A blueprint for creating containers (like a class vs an object in programming).

Why use Docker for Redis?

  • No need to install Redis manually → just run it in a container.

  • Can remove/reset Redis easily without affecting your local machine.

  • Same environment for development, staging, and production.


Installing Redis with Docker (Best Practice → Redis Stack)

Redis Stack = Redis server + extra modules (Search, JSON, Graph, etc.) + GUI (RedisInsight) in one image.

Run the command:

docker run -d \
  --name redis-stack \
  -p 6379:6379 \
  -p 8001:8001 \
  redis/redis-stack:latest

Breaking it down:

  • docker run → Start a new container.

  • -d → Detached mode (runs in background like a service).

  • --name redis-stack → Give your container a readable name.

  • -p 6379:6379 → Map local port 6379 → container port 6379 (Redis server).

  • -p 8001:8001 → Map local port 8001 → container port 8001 (RedisInsight GUI).

  • redis/redis-stack:latest → Image name and version (latest).


What happens when you run it?

  1. Docker Daemon (background service) checks if you already have the redis/redis-stack image.

  2. If not found → pulls it from Docker Hub.

  3. Creates a container from that image.

  4. Starts Redis server (port 6379) and RedisInsight GUI (port 8001).


Check if Redis is running

docker ps

Example output:

CONTAINER ID   IMAGE                      PORTS
abc12345       redis/redis-stack:latest   0.0.0.0:6379->6379, 0.0.0.0:8001->8001

Connecting to Redis CLI inside Docker

docker exec -it redis-stack redis-cli

Breakdown:

  • docker exec → Run a command inside a running container.

  • -it → Two flags together:

    • -i (interactive) → Keeps STDIN open to type commands.

    • -t (tty) → Allocates a terminal session for nice formatting.

  • redis-stack → Name (or ID) of the running container.

  • redis-cli → Redis command-line client.

Test it:

127.0.0.1:6379> PING
PONG

If you see PONG → Redis is working.
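
Once the rate limiters from the later sections are running, a few more commands are handy for poking around (the output below is illustrative; prefer SCAN over KEYS in production):

127.0.0.1:6379> KEYS resend-otp:*
1) "resend-otp:203.0.113.7"
127.0.0.1:6379> TTL resend-otp:203.0.113.7
(integer) 287
127.0.0.1:6379> ZRANGE resend-otp:203.0.113.7 0 -1 WITHSCORES
1) "1717000000000-0.4217"
2) "1717000000000"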


Using RedisInsight GUI

  1. Open → http://localhost:8001

  2. Add a connection: host 127.0.0.1, port 6379 (the defaults for the local container).

  3. Now you can:

    • View all keys (resend-otp:IP, verify-otp:email:IP)

    • Run commands without CLI

    • Visualize sorted sets for sliding window rate limiting.


Fixed Window rate limiter (simple, fast)

Idea: Allow N requests per windowSeconds. We keep a simple counter that resets when the window expires.

  • Pros: Very fast, minimal Redis work (just INCR + EXPIRE)

  • Cons: Burstiness at boundaries (a user can “double dip” across windows)

// middlewares/rateLimiterFixed.js
const redisClient = require("../config/redis");

function rateLimiterFixed({ keyPrefix, maxRequest, windowSeconds }) {
  return async (req, res, next) => {
    try {
      const ip = req.ip; 
      const key = `${keyPrefix}:${ip}`;

      // INCR creates the key if it doesn't exist
      const count = await redisClient.incr(key);

      // Set TTL only the first time we see this key in the current window
      if (count === 1) {
        await redisClient.expire(key, windowSeconds);
      }

      console.log(`${key} → ${count} requests in current fixed window`);

      if (count > maxRequest) {
        return res.status(429).json({
          success: false,
          message: `${keyPrefix} route → Too many requests. Try again after ${windowSeconds} seconds.`,
        });
      }

      next();
    } catch (err) {
      console.error("Fixed window error:", err);
      res.status(500).json({ success: false, message: "Server Error" });
    }
  };
}

module.exports = rateLimiterFixed;

Why if (count === 1) expire(...)?
We only set the TTL (time to live) when the window starts. Every subsequent INCR stays within that same window until the TTL hits 0 and Redis deletes the key.
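
Mounting it on a route is a one-liner. For example (the /login route, the numbers, and loginHandler below are purely illustrative):

// Hypothetical usage: allow 5 attempts per minute per IP on a login route
const rateLimiterFixed = require("./middlewares/rateLimiterFixed");

app.post(
  "/login",
  rateLimiterFixed({ keyPrefix: "login", maxRequest: 5, windowSeconds: 60 }),
  loginHandler // your controller
);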

Boundary burst example (why fixed window can be unfair):

  • Window = 10 minutes, max = 10.

  • User sends 10 requests between 09:09:58 and 09:10:00, then 10 more at 09:10:01.

  • Total: 20 requests within ~3 seconds, yet they land in two different windows → all allowed.
    If that’s a problem, use the Sliding Window below.


Sliding Window rate limiter (fair, production-friendly)

Idea: Store timestamps of requests in a Sorted Set (ZSET). Before each new request:

  1. Trim all entries older than currentTime - windowSeconds.

  2. Count how many remain.

  3. If count ≥ maxRequest, block; else add the current timestamp and continue.

  • Pros: Fair, smooth over window boundaries

  • Cons: Slightly more Redis work (ZADD, ZREMRANGEBYSCORE, ZCARD/ZCOUNT)

// middlewares/rateLimiterSliding.js
const redisClient = require("../config/redis");

function rateLimiterSliding({ keyPrefix, maxRequest, windowSeconds }) {
  return async (req, res, next) => {
    try {
      const ip = req.ip;
      const key = `${keyPrefix}:${ip}`;

      const currentTime = Date.now();                  
      const windowStartTime = currentTime - windowSeconds * 1000;

      // 1) Remove entries older than our sliding window
      await redisClient.zRemRangeByScore(key, 0, windowStartTime);

      // 2) Count how many requests are still in the window
      const numberOfRequests = await redisClient.zCard(key);
      // Alternative (if you skip the trim): count only entries inside the window
      // const numberOfRequests = await redisClient.zCount(key, windowStartTime, currentTime);
      console.log(`${key} → ${numberOfRequests} requests in current sliding window`);
      if (numberOfRequests >= maxRequest) {
        return res.status(429).json({
          success: false,
          message: `${keyPrefix} route → Too many requests. Cooldown ${windowSeconds}s.`,
        });
      }
      // 3) Add current request (score = timestamp)
      await redisClient.zAdd(key, [
        { score: currentTime, value: `${currentTime}-${Math.random()}` },
      ]);
      // 4) Memory cleanup: auto-expire the whole key if the user goes quiet
      await redisClient.expire(key, windowSeconds);

      next();
    } catch (err) {
      console.error("Sliding window error:", err);
      res.status(500).json({ success: false, message: "Server Error" });
    }
  };
}

module.exports = rateLimiterSliding;

Line-by-line: every Redis Z* command explained

Here’s the breakdown for Sliding Window:

const currentTime = Date.now();
const windowStartTime = currentTime - windowSeconds * 1000;
  • currentTime is the current time in milliseconds; ms gives better precision for scores than seconds.

  • windowStartTime is the left edge (starting boundary) of the moving window; anything with a score earlier than this is outside the allowed window.

await redisClient.zRemRangeByScore(key, 0, windowStartTime);
  • Trim: remove every entry older than windowStartTime; otherwise the sorted set grows without bound.
const numberOfRequests = await redisClient.zCard(key);
// OR: const numberOfRequests = await redisClient.zCount(key, windowStartTime, currentTime);
  • ZCARD key returns the number of items in the sorted set.

    • Because we trimmed first, the total item count equals the count for the current window.
  • ZCOUNT key min max returns how many items have scores between min and max.

    • If you don’t trim, use ZCOUNT(key, windowStartTime, currentTime).

    • Important: ZCOUNT(key, currentTime, currentTime) would count only items with score == currentTime, which is not what we want.

if (numberOfRequests >= maxRequest) {
  return res.status(429).json({ message: "Too many requests" });
}
  • If we already hit the cap, block this request.
await redisClient.zAdd(key, [
  { score: currentTime, value: `${currentTime}-${Math.random()}` },
]);
  • Add this request to the window with score = currentTime.

  • Why Sorted Set (ZSET)? Because we need ordering by time and range operations by score. ZSET makes “remove everything before time X” and “count everything after time X” super cheap.

  • Why an array? The node-redis client accepts an array of { score, value } members so you can add several at once; here we add just one.

  • Why value: <timestamp-random>? Members in a ZSET must be unique. If two requests occur in the same millisecond, Math.random() keeps values unique.

await redisClient.expire(key, windowSeconds);
  • Ensure the key auto-expires if a user stops sending requests (keeps memory tidy).
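
If you want to see these commands in isolation, here is the same flow in redis-cli with a toy key and made-up millisecond timestamps:

127.0.0.1:6379> ZADD demo-window 1717000001000 "1717000001000-0.42"
(integer) 1
127.0.0.1:6379> ZADD demo-window 1717000005000 "1717000005000-0.91"
(integer) 1
127.0.0.1:6379> ZREMRANGEBYSCORE demo-window 0 1717000002000
(integer) 1
127.0.0.1:6379> ZCARD demo-window
(integer) 1
127.0.0.1:6379> ZCOUNT demo-window 1717000002000 1717000012000
(integer) 1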

Integrate with real routes: /resend-otp and /verify-otp

We’ll show Sliding Window for fairness (recommended), but you can swap with Fixed.

// server.js (or app.js)
const express = require("express");
const validator = require("validator");
const bcrypt = require("bcrypt");
const redisClient = require("./config/redis");
const rateLimiterSliding = require("./middlewares/rateLimiterSliding");
const rateLimiterFixed = require("./middlewares/rateLimiterFixed");
const User = require("./models/User"); // your Mongoose model
const sendMail = require("./utils/sendMail"); // your mail sender

const app = express();
app.use(express.json());
app.set('trust proxy', 1); // Important if behind proxy for correct req.ip

1) /resend-otp — at most 2 requests / 5 minutes per IP

app.post(
  "/resend-otp",
  rateLimiterSliding({ keyPrefix: "resend-otp", maxRequest: 2, windowSeconds: 300 }),
  async (req, res) => {
    try {
      const { email } = req.body;
      if (!validator.isEmail(email)) {
        return res.status(400).json({ success: false, message: "Invalid email" });
      }

      const user = await User.findOne({ email });
      if (!user) {
        return res.status(404).json({ success: false, message: "User not found" });
      }

      // If existing OTP still valid, don't resend
      if (user.otp && user.otpExpiry && Date.now() < user.otpExpiry.getTime()) {
        return res.status(400).json({
          success: false,
          message: "OTP already sent and still valid",
        });
      }

      // Generate new OTP
      const otpPlain = Math.floor(100000 + Math.random() * 900000).toString();
      const otpHash = await bcrypt.hash(otpPlain, 10);
      user.otp = otpHash;
      user.otpExpiry = new Date(Date.now() + 5 * 60 * 1000); // 5 minutes
      await user.save();

      await sendMail(
        user.email,
        "Your New OTP",
        `<p>Your new OTP is <b>${otpPlain}</b>. It will expire in 5 minutes.</p>`
      );

      res.status(200).json({ success: true, message: "New OTP sent to your email" });
    } catch (err) {
      console.error(err);
      res.status(500).json({ success: false, message: "Server error" });
    }
  }
);

2) /verify-otp — at most 5 wrong attempts / 15 minutes per email + IP

Here we keep an attempts key tied to (email + ip) and only increment on wrong attempts:

app.post(
  "/verify-otp",
  rateLimiterSliding({ keyPrefix: "verify-otp", maxRequest: 20, windowSeconds: 900 }), // optional broader limiter
  async (req, res) => {
    try {
      const { email, otp } = req.body;

      if (!validator.isEmail(email)) {
        return res.status(400).json({ success: false, message: "Invalid email" });
      }

      const user = await User.findOne({ email });
      if (!user) {
        return res.status(404).json({ success: false, message: "User not found" });
      }

      if (!user.otp || !user.otpExpiry) {
        return res.status(400).json({ success: false, message: "No OTP found" });
      }

      if (Date.now() > user.otpExpiry.getTime()) {
        return res.status(400).json({ success: false, message: "OTP expired" });
      }

      const attemptsKey = `otp_attempts:${email}:${req.ip}`;
      const isMatch = await bcrypt.compare(otp, user.otp);

      if (!isMatch) {
        // Increment wrong-attempts counter with a 15-min TTL
        const attempts = await redisClient.incr(attemptsKey); // returns the new count
        if (attempts === 1) {
          await redisClient.expire(attemptsKey, 900); // 15 minutes
        }

        if (attempts >= 5) {
          return res.status(429).json({
            success: false,
            message: "Too many wrong OTP attempts. Try later.",
          });
        }

        return res.status(400).json({ success: false, message: "Invalid OTP" });
      }

      // OTP correct: reset attempts, consume OTP
      await redisClient.del(attemptsKey);
      user.otp = undefined;
      user.otpExpiry = undefined;
      await user.save();

      res.status(200).json({ success: true, message: "OTP verified" });
    } catch (err) {
      console.error(err);
      res.status(500).json({ success: false, message: "Server error" });
    }
  }
);

Why rate limit here? OTP endpoints are a prime target for brute force.
We key the attempts counter by email+IP so a single attacker can’t brute-force one account, while the broader per-IP limiter above stops one IP from hammering many accounts.


Why keys include IP and a prefix

We build keys like:

  • resend-otp:203.0.113.7

  • verify-otp:alice@example.com:203.0.113.7

Why:

  • Prefix (route name) keeps per-route isolation.

  • IP (and/or userId, apiKey) keeps per-actor fairness.

  • Keys read cleanly in Redis Insight.
    Behind proxies, ensure app.set('trust proxy', 1) so req.ip is correct.
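
If you later want to throttle per user or per API key instead of per IP, only the key construction changes. A hypothetical tweak inside the middleware (req.user.id and the x-api-key header are placeholders for whatever your auth layer provides):

// Inside rateLimiterSliding, swap the key construction (hypothetical fallback chain):
const actor = req.user?.id || req.headers["x-api-key"] || req.ip;
const key = `${keyPrefix}:${actor}`;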


EXPIRE vs EXPIREAT (and why we use EXPIRE here)

  • EXPIRE key seconds — relative TTL. “Delete this key in N seconds.”

  • EXPIREAT key unixTimeSeconds — absolute timestamp. “Delete this key at T.”

We use EXPIRE because our windows are relative: “start now, last windowSeconds”.
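
With the node-redis client used throughout this post, the two look like this (the same 5-minute lifetime expressed two ways):

// Relative: delete the key 300 seconds from now
await redisClient.expire(key, 300);

// Absolute: delete the key at a specific Unix timestamp (in seconds)
await redisClient.expireAt(key, Math.floor(Date.now() / 1000) + 300);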


Why Redis over MongoDB/Postgres for rate limiting

  • Latency: Redis runs in memory, Mongo/Postgres hit disk (even with caches).

  • Atomic counters: INCR, ZADD, ZCOUNT etc. are single network roundtrips and very fast.

  • TTL & eviction: Redis can auto-expire keys, keeping memory tidy.

  • Throughput: Rate limiting sits on the hot path; Redis handles huge QPS (queries per second).

Use Mongo/Postgres for persistent data (users, orders, invoices). Use Redis for transient, high-speed work (rate limiting, caching, queues): Redis keeps data in memory, while MongoDB persists to disk, which is slower and not ideal for per-request throttling.


Bonus: API response caching (/todos) with TTL

Let’s cache a free public API response for 30 seconds. The first request hits the upstream API (~407 ms in my tests); subsequent requests are served from Redis (~7–15 ms).

// routes/todos.js
const express = require("express");
const fetch = require("node-fetch"); // or global fetch in Node 18+
const redisClient = require("../config/redis");

const router = express.Router();

router.get("/todos", async (req, res) => {
  const cacheKey = "todos";

  try {
    // 1) Try cache
    const cached = await redisClient.get(cacheKey);
    if (cached) {
      console.log("Cache HIT");
      return res.status(200).json(JSON.parse(cached));
    }

    console.log("Cache Miss");
    // 2) Fetch
    const resp = await fetch("https://jsonplaceholder.typicode.com/todos");
    const data = await resp.json();

    // 3) Store in cache with TTL (30s)
    await redisClient.set(cacheKey, JSON.stringify(data), { EX: 30 });

    res.status(200).json(data);
  } catch (err) {
    console.error("Todos error:", err);
    res.status(500).json({ success: false, message: "Server Error" });
  }
});

module.exports = router;

Observed times (example):

  • 1st call: ~407ms (MISS, goes to source API)

  • Next calls within 30s: ~7–15ms (HIT, served from Redis)

  • After TTL expires, first call is slow again, then fast.
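
To reproduce the timings yourself, curl’s write-out timer works well (assuming the server is running locally on port 3000):

curl -s -o /dev/null -w "total: %{time_total}s\n" http://localhost:3000/todos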


Cheat sheet: pick Fixed or Sliding?

  • Fixed Window

    • Use when: sheer simplicity and speed are desired; small side projects; low risk of boundary bursts.

    • Code: INCR + EXPIRE.

  • Sliding Window (recommended for auth/OTP)

    • Use when: fairness matters; avoid boundary bursts; smoother user experience.

    • Code: ZADD + ZREMRANGEBYSCORE + ZCARD (or ZCOUNT) + EXPIRE.


Complete, copy‑paste friendly middlewares

Fixed:

// middlewares/rateLimiterFixed.js
const redisClient = require("../config/redis");

function rateLimiterFixed({ keyPrefix, maxRequest, windowSeconds }) {
  return async (req, res, next) => {
    try {
      const ip = req.ip;
      const key = `${keyPrefix}:${ip}`;

      const count = await redisClient.incr(key);
      if (count === 1) {
        await redisClient.expire(key, windowSeconds);
      }

      if (count > maxRequest) {
        return res.status(429).json({
          success: false,
          message: `${keyPrefix} route → Too many requests. Try again after ${windowSeconds} seconds.`,
        });
      }

      next();
    } catch (err) {
      console.error("Fixed window error:", err);
      res.status(500).json({ success: false, message: "Server Error" });
    }
  };
}

module.exports = rateLimiterFixed;

Sliding:

// middlewares/rateLimiterSliding.js
const redisClient = require("../config/redis");

function rateLimiterSliding({ keyPrefix, maxRequest, windowSeconds }) {
  return async (req, res, next) => {
    try {
      const ip = req.ip;
      const key = `${keyPrefix}:${ip}`;

      const now = Date.now();
      const windowStart = now - windowSeconds * 1000;

      await redisClient.zRemRangeByScore(key, 0, windowStart);

      // Use either approach:
      const count = await redisClient.zCard(key);
      // const count = await redisClient.zCount(key, windowStart, now);

      if (count >= maxRequest) {
        return res.status(429).json({
          success: false,
          message: `${keyPrefix} route → Too many requests. Cooldown ${windowSeconds}s.`,
        });
      }

      await redisClient.zAdd(key, [{ score: now, value: `${now}-${Math.random()}` }]);
      await redisClient.expire(key, windowSeconds);

      next();
    } catch (err) {
      console.error("Sliding window error:", err);
      res.status(500).json({ success: false, message: "Server Error" });
    }
  };
}

module.exports = rateLimiterSliding;

Final tips & gotchas

  • Trust proxy for correct req.ip: app.set('trust proxy', 1).

  • ZSET is perfect for time windows because scores are timestamps and Redis gives you fast range ops.

  • Use Redis Insight to watch keys/TTL live while testing. It’s super helpful.
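
One more gotcha: the sliding-window middleware sends several commands per request, so two requests arriving at nearly the same instant can interleave between the count and the add, occasionally letting one extra request through. For most apps that slack is acceptable; if you want fewer roundtrips, the trim and count inside the middleware can be batched with MULTI. A sketch against node-redis v4 (full atomicity of count-then-add would still need a Lua script):

// Sketch: pipeline the trim + count into one roundtrip (node-redis v4 multi)
const now = Date.now();
const windowStart = now - windowSeconds * 1000;

const [, count] = await redisClient
  .multi()
  .zRemRangeByScore(key, 0, windowStart)
  .zCard(key)
  .exec(); // replies: [numberRemoved, countInWindow]

if (count >= maxRequest) {
  return res.status(429).json({ success: false, message: "Too many requests" });
}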


That’s it!

You now have:

  • A clear understanding of why and where to rate limit

  • Two battle‑tested middlewares (Fixed & Sliding window)

  • A handy caching example for speeding up APIs

Written by Harsh Sharma

Hi 👋 I'm a Full Stack Developer passionate about building scalable, and cloud-ready applications. I work with the MERN stack (MongoDB, Express, React, Node.js) and Python to craft robust frontend and backend systems. I'm experienced with cloud technologies like AWS (EC2, S3, Lambda) and containerization using Docker. I also love integrating Generative AI (OpenAI, LLMs) into applications and working on real-time features using WebSockets and Apache Kafka. My expertise lies in delivering high-performance, full-stack solutions with clean code, solid architecture, and efficient DevOps practices. Open to freelance or full-time opportunities as a Full Stack Developer!