built-in redis rate limiter script has room for performance tuning
built-in redis rate limiter script has room for performance tuning
Describe the bug The builtin request_rate_limiter.lua script is used for rate limiting, implementing Token Bucket Algorithm, the script's main logic is ok, but has some flaws in performance due to some details.
-
The logics of assigning values to variables
last_tokens, last_refreshedare a little redundant, take assigning tolast_tokensfor example, 1.1. first by trying to get value viaredis.getand assign it tolast_tokens1.2. then judging if it's nil, then assign the value of capacity tolast_tokensas a fallbacklocal last_tokens = tonumber(redis.call("get", tokens_key)) if last_tokens == nil then last_tokens = capacity endbut according to load testing and luau compiler results, doing so could be inefficient since the whole chunk could be replaced by
local last_tokens = tonumber(redis.call("get", tokens_key)) or capacityAs you can see in the compiled code of the chunk in request_rate_limiter.lua below (here I replaced redis.get with a simple
to_numbercall since I ran luau in local lua env, where redis was not configured and the point is that in compiled code, there is redundancy compared to the refined version), we have a lineL0: JUMPXEQKNIL R0 L1 NOTwhich judges if the variable inR0register(which islast_tokens) isNil, if so, jump to label L1 where it's assigned the value ofcapacity, which is the lineLOADN R0 10000Function 0 (??): REMARK builtin tonumber/1 2: local last_tokens = tonumber("100000") LOADK R1 K0 ['100000'] FASTCALL1 62 R1 L0 GETIMPORT R0 2 [tonumber] CALL R0 1 1 3: if last_tokens == nil then L0: JUMPXEQKNIL R0 L1 NOT 4: last_tokens = capacity LOADN R0 10000 6: L1: RETURN R0 0While in the compiled code of the refined chunk below, by using
orwe only produce anORKoperation, which means we check the value ofR1register, if it'sNil, then assign the value ofK0toR1, and then assign the value of R1 to R0 and return.Function 0 (??): REMARK builtin tonumber/1 2: local last_tokens = tonumber("100000") or capacity LOADK R2 K1 ['100000'] FASTCALL1 62 R2 L0 GETIMPORT R1 3 [tonumber] CALL R1 1 1 L0: ORK R0 R1 K0 [10000] 3: RETURN R0 0 -
In request_rate_limiter.lua, a local variable
allowed_numindicating whether the current request is allowed is unnecessarily created, and the assignment of local variablenew_tokenscan also be similarly optimized.local allowed = filled_tokens >= requested local new_tokens = filled_tokens local allowed_num = 0 if allowed then new_tokens = filled_tokens - requested allowed_num = 1 end ... return { allowed_num, new_tokens }while the whole logic can also be replaced by simplified and..or clause
local allowed = filled_tokens >= requested local new_tokens = allowed and filled_tokens - requested or filled_tokens ... return { allowed and 1 or 0, new_tokens }
To sum up, the full optimized request_rate_limiter.lua is as follows:
redis.replicate_commands()
local tokens_key = KEYS[1]
local timestamp_key = KEYS[2]
local rate = tonumber(ARGV[1])
local capacity = tonumber(ARGV[2])
local now = tonumber(ARGV[3]) or redis.call('TIME')[1]
local requested = tonumber(ARGV[4])
local fill_time = capacity / rate
local ttl = math.floor(fill_time * 2)
local last_tokens = tonumber(redis.call("get", tokens_key)) or capacity
local last_refreshed = tonumber(redis.call("get", timestamp_key)) or 0
local delta = math.max(0, now-last_refreshed)
local filled_tokens = math.min(capacity, last_tokens+(delta*rate))
local allowed = filled_tokens >= requested
local new_tokens = allowed and filled_tokens - requested or filled_tokens
if ttl > 0 then
redis.call("setex", tokens_key, ttl, new_tokens)
redis.call("setex", timestamp_key, ttl, now)
end
return { allowed and 1 or 0, new_tokens }
And the revised script passes all test cases in RedisRateLimiterLuaScriptTests.java
Sample
The load test using JMeter in the image below, by using request_rate_limiter.lua I get max qps of 7913.2, while by changing the script to the refined version I get max qps of 8479.2
Given if put to use, the execution count of the lua script can be very large, a little optimization may bring in a significant improvement in performance.
So far I have done below tests.
- regression testing using RedisRateLimiterLuaScriptTests.java
- load testing using my own application. Please help check what further testing needs to be done if you choose to involve this change.