API Rate Limiter - Theory
In this article, our focus will be on the API rate limiter—a crucial element within an API Gateway. To put it simply, a rate limiter restricts the number of API invocations per time unit for individual users, IPs, or APIs. Its role is to prevent abuse, promote fair usage, and safeguard the server against potential overloads stemming from either malicious or unintentional excessive requests.
Algorithms
There are multiple algorithms available for rate limiter. Here is a small list of algorithms widely used.
Leaky Bucket Algorithm
Token Bucket Algorithm
Fixed window counter
Sliding window log
Exponential backoff
Weighted distributed rate limiting
In this article, our primary focus will be on the Token Bucket algorithm. In essence, each user or IP is associated with a 'token bucket' filled with a predetermined number of tokens. Each API call consumes one token from the bucket. When the tokens are depleted, further requests are throttled until the bucket is refilled. Periodically, a service runs at predetermined intervals to replenish the tokens in the bucket. This process ensures that users have a fresh set of tokens available at the end of each interval, allowing for continued API usage.
Challenges
This algorithm mainly has 2 challenges.
A separate service needs to run on the system which could refill the token after a fixed interval. This will be an additional cost on the system.
Concurrency: If multiple requests are received from the same user/IP, there can be race condition.
Solution
To reduce the cost of the extra service, we can use something called lazy refilling. We will completely take off the extra service part and refill the respective bucket only if a particular user tries to access a resource. We will use a TTL index in Redis to keep track of time. Here is a pseudo-code, which will actually make things clearer.
//pseudocode for handling lazy loading of refill
if(user.hasToken) {
user.token--;
return true;
} else {
if(refil key is present) {
return false;
} else {
user.token = tokenLimit-1;
user.refillKey = true;
user.refillKey.setTTL(intervalLimit);
return true;
}
}
In addressing the concurrency issue, we have encapsulated the entire code responsible for token generation and updation within a Lua script. The execution of Lua scripts in Redis is atomic, ensuring that the values are consistently read and modified without the risk of race conditions. Importantly, we refrain from isolating read operations from the Lua script, as every read operation is coupled with a mandatory update. This approach further guarantees the integrity of the token-related operations in a concurrent environment.
Conclusion
In this article we have just looked into the theory part of the rate limiter. In the next article we will look into how to implement a rate limiter. We will be using nodejs and redis to do the implementation.
Subscribe to my newsletter
Read articles from Somesh Mohan directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by