rolling-rate-limiter
Question about the ClassDojo engineering article on rate-limiting approaches
Hi,
I tried reaching you via Twitter and other platforms but couldn't find you, so I'm creating an issue here.
https://engineering.classdojo.com/blog/2015/02/06/rolling-rate-limiter
I was going through your article on the ClassDojo engineering blog; it was written 7 years ago.
> Each user has two keys associated with them: the token bucket, and a timestamp of the last time the bucket was refilled. When a user attempts to perform an action, we fetch the stored timestamp. We calculate how many tokens the user should have been granted since that last request. We can then proceed with the algorithm, using this new token count.
Quoting this from the blog, on the extension to the token bucket algorithm: I did not understand "We calculate how many tokens the user should have been granted since the last request. We can then proceed with the algorithm, using this new token count." How do we calculate that from just the timestamp, and what is the new token count? Can you elaborate more on this?
My understanding is that in rate limiting we have two constraints: the time interval (T) and the number of requests (N).
According to your approach, each user has two keys, the tokens and the timestamp, something like this:
```
{
  "alex": {
    "timestamp": <epoch_timestamp>,
    "tokens": <tokens>
  }
}
```
When the user requests an action, we do the following (see the sketch after this list):
- Get the last requested timestamp.
- If curr_timestamp - last_requested_timestamp > T, where T is the time interval:
  a. reset tokens to max_tokens
  b. update timestamp to curr_timestamp
- Update tokens to tokens - 1 (as the user has requested an action now).
- Update timestamp to curr_timestamp.
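In code, a rough sketch of what I mean (single process, with an in-memory map standing in for Redis; names like `maxTokens` and `intervalMs` are just placeholders):

```ts
type Bucket = { timestamp: number; tokens: number };

// In-memory stand-in for the per-user Redis keys.
const buckets = new Map<string, Bucket>();

// Returns true if the action is allowed. maxTokens = N, intervalMs = T.
function tryAction(userId: string, maxTokens: number, intervalMs: number): boolean {
  const now = Date.now();
  const bucket = buckets.get(userId) ?? { timestamp: now, tokens: maxTokens };
  // If a whole interval has passed since the last request, refill the bucket.
  if (now - bucket.timestamp > intervalMs) {
    bucket.tokens = maxTokens;
    bucket.timestamp = now;
  }
  if (bucket.tokens <= 0) return false; // limited
  bucket.tokens -= 1; // the user performs an action now
  bucket.timestamp = now; // update timestamp to curr_timestamp
  buckets.set(userId, bucket);
  return true;
}
```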
Hi, thanks for reaching out. It's been a while since I wrote that article, so I'm still paging some context back into my head.
> How do we calculate that from just the timestamp, and what is the new token count? Can you elaborate more on this?
Suppose the rate-limiting rule is "1 request per second". If you look up that the last timestamp was T, and it's now time T + 30, then the user can be credited with 30 tokens, capped at some max number.
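In code, that credit step might look roughly like this (just a sketch; `refillRatePerSec` and `maxTokens` would come from the rate-limit config):

```ts
// Credit tokens for the time elapsed since the stored timestamp.
// E.g. "1 request per second": last seen at time T, now T + 30
// => 30 new tokens, capped at maxTokens.
function creditTokens(
  storedTokens: number,
  storedTimestampSec: number,
  nowSec: number,
  refillRatePerSec: number,
  maxTokens: number
): number {
  const earned = Math.floor((nowSec - storedTimestampSec) * refillRatePerSec);
  return Math.min(storedTokens + earned, maxTokens);
}
```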
Hello Peter, the blog post you wrote 7 years ago is referenced in a well-known book about system design, and I want to learn the different algorithms for rate limiting. I understand you are busy; it would be appreciated if you could further explain why the token bucket doesn't work. I think only one read plus one write to Redis is required for each request, and it's unnecessary to have a scheduled refiller that resets the bucket.
I am thinking about a similar algorithm to Lokesh's, and I also want to handle minDifference. The steps are below, with a code sketch after the list.
1. Get the last record from Redis using the user id as the key. The value is JSON containing a timestamp and a token count; both are numbers.
2. If the record is not found in Redis, not limited:
   a. set tokens to max_tokens - 1
   b. set timestamp to curr_timestamp
3. If curr_timestamp - last_requested_timestamp > interval, not limited:
   a. reset tokens to max_tokens - 1
   b. update timestamp to curr_timestamp
4. If interval > curr_timestamp - last_requested_timestamp > minDifference and tokens are available, not limited:
   a. update tokens to tokens - 1
   b. update timestamp to curr_timestamp
5. If curr_timestamp - last_requested_timestamp < minDifference, limited (a rejected action is still an action):
   a. update tokens to tokens - 1
   b. update timestamp to curr_timestamp
6. If curr_timestamp - last_requested_timestamp > minDifference but no tokens are available, limited:
   a. no update in Redis
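Roughly, in code (just a sketch, all names mine; deliberately non-atomic, one read then possibly one write, to match the steps above):

```ts
type Policy = { interval: number; maxInInterval: number; minDifference: number };
type Bucket = { timestamp: number; tokens: number };

// In-memory stand-in for the per-user Redis record described above.
const store = new Map<string, Bucket>();

// Returns true if the request is limited. Timestamps are in the same units
// as interval and minDifference; maxInInterval plays the role of max_tokens.
function isLimited(userId: string, now: number, p: Policy): boolean {
  const rec = store.get(userId); // step 1: read
  if (!rec) { // step 2: no record yet
    store.set(userId, { timestamp: now, tokens: p.maxInInterval - 1 });
    return false;
  }
  const elapsed = now - rec.timestamp;
  if (elapsed > p.interval) { // step 3: a full interval has passed, reset
    store.set(userId, { timestamp: now, tokens: p.maxInInterval - 1 });
    return false;
  }
  if (elapsed < p.minDifference) { // step 5: too soon; a reject is also an action
    store.set(userId, { timestamp: now, tokens: rec.tokens - 1 });
    return true;
  }
  if (rec.tokens > 0) { // step 4: tokens available
    store.set(userId, { timestamp: now, tokens: rec.tokens - 1 });
    return false;
  }
  return true; // step 6: out of tokens; no write
}
```

Feeding the timeline from the test below (0, 4, 5, 8, 11, 17) through this function gives limited = false, false, true, true, true, false, matching the walkthrough.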
https://github.com/peterkhayes/rolling-rate-limiter/blob/d2dea273393a7a10126a62e880f12a663db3568a/src/index.test.ts#L246-L304
The above is a good test case, where requests are made at times 0, 4, 5, 8, 11, and 17, and the policy is `{ interval: 10, maxInInterval: 3, minDifference: 2 }`.
Using the token bucket: at time 0, the user doesn't have a record in Redis, so a new record is saved and the request is not limited:
```
{
  timestamp: 0,
  tokens: 2
}
```
The tokens are 2 because the max is 3 and 1 token is used at time 0.
At time 4, the user consumes another token, the record is updated, and the request is not limited:
```
{
  timestamp: 4,
  tokens: 1
}
```
At time 5, the user tries to consume another token; however, the request breaks minDifference, so it is limited. A rejected action is also an action:
```
{
  timestamp: 5,
  tokens: 0
}
```
At time 8, no tokens are left, so the request is limited. It doesn't break minDifference, so nothing is written to Redis:
```
{
  timestamp: 5,
  tokens: 0
}
```
At time 11, there are still no tokens left, so the request is limited. It doesn't break minDifference, so nothing is written to Redis:
```
{
  timestamp: 5,
  tokens: 0
}
```
At time 17, since 17 - 5 > 10, the bucket is reset and the request is not limited:
```
{
  timestamp: 17,
  tokens: 2
}
```
Sorry for the delayed response.
In your suggested algorithm, it looks like you're making up to two requests to Redis: one to read, and (sometimes) one to write. In step 1, you fetch data from Redis; then, in application code, you execute some logic and, based on the results, set some data in Redis.
This is not safe against race conditions. As I wrote in the blog post:
> Unfortunately, this algorithm also has a problem: it fails when two processes need to share the rate limiter. Redis can batch operations into one atomic action, but to calculate how many tokens we need to give the user, we need at least two trips to redis: one to get the last timestamp, and one to set the new token count. Without delving into Redis Lua scripting (something none of us had experience with, and that would be hard to mock out in tests), we couldn’t find a way to combine all needed operations into a single atomic set of Redis actions. Because of this, if two clients using the rate limiter both tried to verify a user action at the same time, we could have the following sequence of operations:
>
> - The user has enough tokens left for one action.
> - Not enough time has passed since the last action for more tokens to be granted.
> - Client 1 gets the stored timestamp and token count.
> - Client 2 gets the stored timestamp and token count.
> - Client 1 calculates that there is no need to add tokens, allows the action, and tells redis to set the token count to 0.
> - Client 2 also calculates that there is no need to add tokens, allows the action, and tells redis to set the token count to 0.
>
> Needless to say, this is not ideal. If we have several dozen workers all processing push notification payloads, it would be possible for a user to be spammed with dozens of pushes all at once.
In short, you need to make sure that you make only one transactional request to Redis, with no application logic in the middle.
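For illustration only (not what this library actually does, and the blog deliberately avoided Lua scripting), here's roughly what a single atomic round trip could look like via a Lua script with ioredis; every name here is hypothetical:

```ts
import Redis from "ioredis";

const redis = new Redis();

// One atomic round trip: read the stored state, credit tokens for the elapsed
// time, and take a token if one is available. Passing the client's clock in as
// ARGV[3] is a simplification; a real script might use Redis's own TIME.
const TAKE_TOKEN = `
  local max    = tonumber(ARGV[1])
  local rate   = tonumber(ARGV[2])
  local now    = tonumber(ARGV[3])
  local tokens = tonumber(redis.call('HGET', KEYS[1], 'tokens') or ARGV[1])
  local last   = tonumber(redis.call('HGET', KEYS[1], 'timestamp') or ARGV[3])
  tokens = math.min(tokens + math.floor((now - last) * rate), max)
  local allowed = 0
  if tokens > 0 then
    tokens = tokens - 1
    allowed = 1
  end
  redis.call('HSET', KEYS[1], 'tokens', tokens, 'timestamp', now)
  return allowed
`;

async function tryAction(userId: string, maxTokens: number, refillPerSec: number): Promise<boolean> {
  const nowSec = Math.floor(Date.now() / 1000);
  const allowed = (await redis.eval(
    TAKE_TOKEN, 1, `ratelimit:${userId}`, maxTokens, refillPerSec, nowSec
  )) as number;
  return allowed === 1;
}
```

Because the whole script runs as one command on Redis's single thread, no other client's read can interleave between the HGET and the HSET.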
I think the word "atomic" is not the property that guarantees linearizability; rather, it's the "isolated" nature of Redis's atomic commits. "Atomic" just means abortable, but "isolated" actually means non-interfering execution and prevention of those race conditions.
This means that Redis transactions have a true serializable isolation level, and also that no rollback mechanism is required to implement WATCH.
The consensus seems to be that there is no race condition even with the first, simpler approach. Could you please review the discussion here?
To clarify: the race condition I'm worried about is not about two clients executing their commands in Redis at the same time. I know that's not possible. Instead, it's about two clients each making two requests to Redis, where the first client's second request comes after the second client's first request. That's why everything has to happen in a single batch; that lets Redis's single-threaded nature work for us. This problem could probably be solved with locking as well, but that's not the approach I took in this library.
If we can temporarily allow the count to go negative, it's possible to use a single operation instead of two Redis commands when making a request to Redis.
"Suppose there are two requests coming at the same time, and we only had one token in the bucket. Each Rate Limiter node calls Redis cache with the decr operation. Since Redis computation is single-threaded, one of them will get the response of 0, and the other will receive -1. Therefore, only one of the requests is accepted"