Why Rate Limiting is Crucial in Backend Systems

Rate limiting is an important step in the development of almost any backend. Sensitive routes, such as those related to authentication, are usually heavily rate-limited for better security and to prevent spam and abuse.

Here are some other reasons why we should implement rate limiting.

  • Preventing Overload: Usually, every endpoint in a backend has some basic rate limiting. This keeps the load on the servers in check, so the service stays available and responsive for all users.

  • Preventing Abuse: Popular websites are frequently attacked by people trying to guess passwords and OTPs (One-Time Passwords) with a brute-force approach, sending requests to the server in bulk. By limiting how often a user can submit passwords or OTPs to these endpoints, we reduce the chances of an account being compromised.

  • Managing Traffic: High-demand websites, such as ticket booking sites for shows and concerts, see sudden spikes in backend requests. In this scenario, rate limiting the endpoints reduces the load and gives every person a fair chance to book tickets.

  • DDoS Protection: In a Distributed Denial of Service (DDoS) attack, the website is overwhelmed with requests from multiple sources, which causes the servers to choke. Rate limiting alone is often not very helpful here, because the server receives requests from many different IPs that each look legitimate, so IP-based rate limiting (the most basic and common strategy) fails. In this case, other methods are usually implemented to mitigate the attack, such as:

    • Application-Layer Gateway (ALG): It's a network device that analyzes packets at the application layer to detect attack patterns. For example, HTTP flood attacks can be detected using ALG.

    • Web Application Firewall (WAF): A WAF sits between the internet and your web applications. It filters incoming traffic based on predefined or custom rules, providing real-time protection for web applications.

    • Cloudflare's DDoS Protection: Cloudflare offers a suite of DDoS mitigation services that are built to automatically scale and adapt in response to the rapidly evolving nature of cyber threats. These include rate limiting, IP blocking, and more.

    • Intrusion detection systems (IDS) and Intrusion prevention systems (IPS): They can detect malicious traffic at multiple layers, including the network layer, using deep packet inspection.

Rate limiting can be keyed on the IP address, user ID, geolocation, or device fingerprint. Here is a quick example of how to add rate limiting to a backend.

Let's set up basic rate limiting for auth endpoints, and look at how to implement it in an Express.js + Node.js backend.

Setting up Rate-Limiting in Express.js

We can add rate limiting as middleware. First, install the express-rate-limit package (npm install express-rate-limit). Here is the implementation:

import express from 'express';
import rateLimiter from 'express-rate-limit';

const app = express();
app.use(express.json()); // parse JSON bodies so req.body.otp is available

const rateLimiterOptions = {
    windowMs: 5 * 60 * 1000, // 5 minutes
    max: 5, // limit each IP to at most 5 requests per window
    message: 'Too many requests, please try again after some time',
    standardHeaders: true, // send the RateLimit-* headers with each response
    legacyHeaders: false // don't send the legacy X-RateLimit-* headers
};

app.post('/reset-password-otp', rateLimiter(rateLimiterOptions), (req, res) => {
    const otp = req.body.otp;
    if (!otp) {
        return res.status(400).json({ message: 'OTP not found' });
    }
    /* Logic for reset-password via OTP */
    res.status(200).json({ message: `${otp} verified successfully` });
});

app.listen(3000); // assuming port 3000 for this example
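
To see the limiter in action, start the server and hit the endpoint a few times (the OTP value below is just a placeholder):

curl -i -X POST http://localhost:3000/reset-password-otp \
     -H "Content-Type: application/json" \
     -d '{"otp": "123456"}'

The first 5 requests within a 5-minute window from the same IP succeed; from the 6th onwards, express-rate-limit rejects the request with a 429 Too Many Requests status and the configured message.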

Let’s discuss what these rateLimiterOptions do:

  • windowMs: This is the duration, in milliseconds, of the 'window' (or bucket) we count requests in. In our case, it's set to 5 minutes (300 seconds, i.e. 300,000 milliseconds). The rate limiter counts how many requests each IP address makes during the current window; once the window expires, the count starts over.

  • max: This option sets the maximum number of requests that can be made to the Express app from a single client IP before further requests are blocked. In this case, it's set to 5. So if an IP address makes more than 5 requests in any given windowMs period (here, 5 minutes), subsequent requests will receive the response message we specified: 'Too many requests, please try again after some time'.

  • standardHeaders: If this option is set to true (as in our case), rate limit info is added to the RateLimit-* response headers sent back with each response (see the sample after this list). This is useful for debugging and for clients that want to back off gracefully. Keep in mind that these headers also reveal how your limits are configured, so use this wisely.

  • legacyHeaders: If set to false (as in our case), express-rate-limit will not add the X-RateLimit-* headers, which are considered legacy now that the standard RateLimit-* headers exist.
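
With standardHeaders enabled, responses carry rate limit headers roughly like the following (the exact names and values depend on the version of express-rate-limit):

RateLimit-Limit: 5
RateLimit-Remaining: 4
RateLimit-Reset: 297

Here RateLimit-Remaining is how many requests are left in the current window and RateLimit-Reset is (roughly) the number of seconds until the window resets.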

The gist: when a new request comes in from an IP address, the middleware checks whether there are any available 'slots' within the current time window. If slots are available (i.e., fewer than max requests have been made within the last windowMs milliseconds), the request is processed and a slot is consumed.

If no slots are available, i.e., max requests have already been made in the past windowMs period, the request is rejected with a 429 response carrying the configured message ('Too many requests, please try again after some time'). This means that within any given windowMs period, no more than max requests are allowed from a single IP address.
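
As mentioned earlier, the key doesn't have to be the IP address. express-rate-limit also accepts a keyGenerator function, so we can bucket requests by user ID (or any other identifier) instead. Here is a minimal sketch, assuming an authentication middleware has already attached req.user, and using a hypothetical /change-email route:

import rateLimiter from 'express-rate-limit';

// Sketch: key the limiter by the authenticated user's ID instead of the client IP.
// Assumes an earlier auth middleware has set req.user for this route.
const perUserLimiterOptions = {
    windowMs: 5 * 60 * 1000,
    max: 5,
    standardHeaders: true,
    legacyHeaders: false,
    keyGenerator: (req) => req.user.id // one bucket per user instead of per IP
};

app.post('/change-email', rateLimiter(perUserLimiterOptions), (req, res) => {
    /* Logic for changing the user's email */
    res.status(200).json({ message: 'Email updated' });
});

The same idea extends to the other keys mentioned above (geolocation, device fingerprint): as long as we can derive a stable string for a request, keyGenerator can use it as the bucket key.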
