Understanding Token Bucket: A Powerful Rate Limiting Strategy

Nayan Kunwar

In the world of backend development, rate limiting is a must-have. Whether you're building an API or a microservice, rate limiting ensures that your server doesn't get overwhelmed by too many requests — either by accident or abuse. Among the various strategies, the Token Bucket algorithm stands out for its flexibility and control.

In this article, we’ll dive into what Token Bucket is, how it works, when to use it, and even how to implement it.

Why Rate Limiting?

Before jumping into Token Bucket, let’s understand why rate limiting matters:

  • Security: Protects against brute-force attacks or API abuse.

  • Fairness: Ensures no single user consumes too many resources.

  • Stability: Keeps your system stable under high traffic.

Now, let’s break down one of the most balanced approaches: Token Bucket.

What Is the Token Bucket Algorithm?

Imagine a bucket that holds tokens. Each token represents permission to perform one request.

  • The bucket has a maximum capacity (say, 100 tokens).

  • Tokens are added at a fixed rate (say, 10 per second).

  • Every time a user sends a request, they must "spend" one token.

  • If there are no tokens left, the request is either:

    • Rejected (hard limit), or

    • Delayed until a token becomes available (soft limit).
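The rules above boil down to a single spend-or-reject check per request. Here is a minimal sketch with made-up numbers (capacity 3, no refill during the burst), just to show the decision:

```javascript
// Minimal illustration of the token-spend rule: each request costs
// one token; with an empty bucket, the request is rejected (hard limit).
let tokens = 3; // bucket starts full

function handleRequest() {
  if (tokens > 0) {
    tokens -= 1;        // spend one token for this request
    return "allowed";
  }
  return "rejected";    // no tokens left
}

const results = [1, 2, 3, 4].map(handleRequest);
console.log(results); // first 3 allowed, 4th rejected
```

The soft-limit variant would queue the fourth request instead of rejecting it, releasing it once the refill adds a token back.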

Key Concepts:

  • Burstiness: Token Bucket allows for sudden bursts of traffic as long as tokens are available.

  • Recovery: The bucket refills over time, so users can make more requests later.

  • Flexibility: A better fit for real-time systems than stricter algorithms like Leaky Bucket, which force traffic into a uniform rate.

📊 Token Bucket vs Leaky Bucket

| Feature | Token Bucket | Leaky Bucket |
| --- | --- | --- |
| Traffic behavior | Allows bursts | Enforces a uniform rate |
| Refill mechanism | Adds tokens at intervals | Drains requests at a fixed rate |
| Flexibility | More flexible | More strict |
| Use case | APIs, real-time apps | Bandwidth shaping |
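To make the contrast concrete, a leaky bucket can be sketched as a fixed-size queue drained at a constant rate. This is only an illustration of the idea, not a production traffic shaper; a timer would call `leak()` once per tick:

```javascript
// Minimal leaky-bucket sketch: bursts queue up (or overflow), and the
// bucket "leaks" requests at a fixed rate regardless of burst size.
class LeakyBucket {
  constructor(capacity) {
    this.capacity = capacity; // max queued requests
    this.queue = [];
  }

  // A burst is queued or dropped; it is never served all at once.
  offer(request) {
    if (this.queue.length < this.capacity) {
      this.queue.push(request);
      return true;  // queued
    }
    return false;   // overflow: dropped
  }

  // Called once per tick by a timer: serve exactly one request.
  leak() {
    return this.queue.shift(); // undefined if the queue is empty
  }
}

const bucket = new LeakyBucket(3);
const accepted = [1, 2, 3, 4, 5].map((r) => bucket.offer(r));
console.log(accepted);      // [true, true, true, false, false]
console.log(bucket.leak()); // 1 — served in arrival order, one per tick
```

Notice the difference: a full token bucket would serve all five requests immediately, while the leaky bucket smooths them out to one per tick no matter what.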

How Does It Work (Example)?

Let's say:

  • Bucket capacity = 60 tokens

  • Refill rate = 1 token per second

  • Incoming traffic = a user sends 20 requests at once

If the bucket is full:

  • All 20 requests go through immediately (20 tokens consumed).

  • Over time, 1 token refills every second.

  • Once the bucket is empty, requests will be rejected or delayed until tokens are replenished.

This makes Token Bucket great for use cases where short bursts are acceptable but sustained abuse is not.
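The walkthrough above can be checked with a small simulation. To keep the run deterministic, elapsed time is passed in explicitly rather than read from the system clock:

```javascript
// Simulates the example: capacity 60, refill 1 token/second,
// a burst of 20 requests, then 5 seconds of refill.
function simulate() {
  const capacity = 60;
  const refillRate = 1; // tokens per second
  let tokens = capacity;

  // Burst: 20 requests arrive at once against a full bucket.
  let allowed = 0;
  for (let i = 0; i < 20; i++) {
    if (tokens > 0) {
      tokens -= 1;
      allowed += 1;
    }
  }
  const afterBurst = tokens; // 60 - 20 = 40

  // Refill: 5 seconds later, 5 tokens are added (capped at capacity).
  const secondsPassed = 5;
  tokens = Math.min(capacity, tokens + secondsPassed * refillRate);

  return { allowed, afterBurst, afterRefill: tokens };
}

console.log(simulate()); // { allowed: 20, afterBurst: 40, afterRefill: 45 }
```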

Implementing Token Bucket in Node.js (in-memory)

Here’s a simplified version using JavaScript:

class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }

  // Lazily top up the bucket based on how much time has passed
  // since the last refill. Called on every request, so no timer is needed.
  refill() {
    const now = Date.now();
    const secondsPassed = (now - this.lastRefill) / 1000;
    const tokensToAdd = Math.floor(secondsPassed * this.refillRate);

    if (tokensToAdd > 0) {
      // Cap at capacity, and only advance lastRefill when whole tokens
      // were added, so fractional progress isn't lost.
      this.tokens = Math.min(this.capacity, this.tokens + tokensToAdd);
      this.lastRefill = now;
    }
  }

  // Returns true if the request may proceed (one token spent),
  // false if the caller should reject or delay it.
  tryRemoveToken() {
    this.refill();

    if (this.tokens > 0) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Example usage
const bucket = new TokenBucket(10, 2); // 10 tokens max, 2 tokens/sec

setInterval(() => {
  if (bucket.tryRemoveToken()) {
    console.log("Request allowed");
  } else {
    console.log("Rate limit exceeded");
  }
}, 300);

Note: This in-memory bucket only works within a single process. For production systems running multiple instances, use a Redis-based or otherwise distributed bucket so every instance enforces the same shared limit.
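In practice, a rate limiter usually keeps one bucket per client rather than one global bucket. Here is a minimal single-process sketch of that idea; the key, the `allowRequest` name, and the capacity/rate numbers are all illustrative, not a library API:

```javascript
// One bucket per client key (e.g. an IP address or API key).
// Compact version of the TokenBucket class from the article.
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }

  tryRemoveToken() {
    const now = Date.now();
    const tokensToAdd = Math.floor(((now - this.lastRefill) / 1000) * this.refillRate);
    if (tokensToAdd > 0) {
      this.tokens = Math.min(this.capacity, this.tokens + tokensToAdd);
      this.lastRefill = now;
    }
    if (this.tokens > 0) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

const buckets = new Map(); // clientKey -> TokenBucket

function allowRequest(clientKey) {
  if (!buckets.has(clientKey)) {
    buckets.set(clientKey, new TokenBucket(3, 2)); // burst of 3, 2/sec sustained
  }
  return buckets.get(clientKey).tryRemoveToken();
}

// Client "a" burns through its burst; client "b" is unaffected.
console.log([1, 2, 3, 4].map(() => allowRequest("a"))); // [true, true, true, false]
console.log(allowRequest("b"));                         // true
```

In a framework like Express, `allowRequest` would sit in middleware that returns HTTP 429 when it yields false; with Redis, the Map lookup becomes an atomic read-modify-write on a shared key.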

When Should You Use Token Bucket?

Use Token Bucket when:

  • You want to allow bursts of traffic but control the average rate.

  • Your system needs to support temporary spikes (e.g., during user login).

  • You want more flexibility than strict throttling.

Ideal for:

  • APIs

  • Messaging queues

  • WebSocket or WebRTC signaling

  • Cloud services with quotas

Token Bucket in the Real World

Many APIs (like Stripe, GitHub, Twitter) use token-bucket-style rate limiting because:

  • It provides user-friendly flexibility.

  • Allows traffic smoothing without rejecting legitimate usage.

  • Handles both fairness and performance well.

Conclusion

The Token Bucket algorithm is one of the best tools for balancing system protection and user experience. Its flexibility makes it suitable for most backend applications — especially in Node.js, microservices, or real-time apps.

If you're building an API or a distributed service, implementing Token Bucket (even using tools like Redis) is a smart move.

Thank you for reading.
