spring-cloud-gateway icon indicating copy to clipboard operation
spring-cloud-gateway copied to clipboard

built-in redis rate limiter script has room for performance tuning

Open iis-MarkKuang opened this issue 11 months ago • 0 comments

built-in redis rate limiter script has room for performance tuning

Describe the bug The builtin request_rate_limiter.lua script is used for rate limiting, implementing Token Bucket Algorithm, the script's main logic is ok, but has some flaws in performance due to some details.

  1. The logics of assigning values to variables last_tokens, last_refreshed are a little redundant, take assigning to last_tokens for example, 1.1. first by trying to get value via redis.get and assign it to last_tokens 1.2. then judging if it's nil, then assign the value of capacity to last_tokens as a fallback

    local last_tokens = tonumber(redis.call("get", tokens_key))
    if last_tokens == nil then
      last_tokens = capacity
    end
    

    but according to load testing and luau compiler results, doing so could be inefficient since the whole chunk could be replaced by

    local last_tokens = tonumber(redis.call("get", tokens_key)) or capacity
    

    As you can see in the compiled code of the chunk in request_rate_limiter.lua below (here I replaced redis.get with a simple to_number call since I ran luau in local lua env, where redis was not configured and the point is that in compiled code, there is redundancy compared to the refined version), we have a line L0: JUMPXEQKNIL R0 L1 NOT which judges if the variable in R0 register(which is last_tokens) is Nil, if so, jump to label L1 where it's assigned the value of capacity, which is the line LOADN R0 10000

    Function 0 (??):
    REMARK builtin tonumber/1
        2: local last_tokens = tonumber("100000")
    LOADK R1 K0 ['100000']
    FASTCALL1 62 R1 L0
    GETIMPORT R0 2 [tonumber]
    CALL R0 1 1
        3: if last_tokens == nil then
    L0: JUMPXEQKNIL R0 L1 NOT
        4:   last_tokens = capacity
    LOADN R0 10000
        6:
    L1: RETURN R0 0
    

    While in the compiled code of the refined chunk below, by using or we only produce an ORK operation, which means we check the value of R1 register, if it's Nil, then assign the value of K0 to R1, and then assign the value of R1 to R0 and return.

    Function 0 (??):
    REMARK builtin tonumber/1
        2: local last_tokens = tonumber("100000") or capacity
    LOADK R2 K1 ['100000']
    FASTCALL1 62 R2 L0
    GETIMPORT R1 3 [tonumber]
    CALL R1 1 1
    L0: ORK R0 R1 K0 [10000]
        3:
    RETURN R0 0
    
  2. In request_rate_limiter.lua, a local variable allowed_num indicating whether the current request is allowed is unnecessarily created, and the assignment of local variable new_tokens can also be similarly optimized.

    local allowed = filled_tokens >= requested
    local new_tokens = filled_tokens
    local allowed_num = 0
    if allowed then
      new_tokens = filled_tokens - requested
      allowed_num = 1
    end
    
    ...
    return { allowed_num, new_tokens }
    

    while the whole logic can also be replaced by simplified and..or clause

    local allowed = filled_tokens >= requested
    local new_tokens = allowed and filled_tokens - requested or filled_tokens
    ...
    
    return { allowed and 1 or 0, new_tokens }
    

To sum up, the full optimized request_rate_limiter.lua is as follows:

redis.replicate_commands()

local tokens_key = KEYS[1]
local timestamp_key = KEYS[2]

local rate = tonumber(ARGV[1])
local capacity = tonumber(ARGV[2])
local now = tonumber(ARGV[3]) or redis.call('TIME')[1]
local requested = tonumber(ARGV[4])

local fill_time = capacity / rate
local ttl = math.floor(fill_time * 2)

local last_tokens = tonumber(redis.call("get", tokens_key)) or capacity
local last_refreshed = tonumber(redis.call("get", timestamp_key)) or 0

local delta = math.max(0, now-last_refreshed)
local filled_tokens = math.min(capacity, last_tokens+(delta*rate))
local allowed = filled_tokens >= requested
local new_tokens = allowed and filled_tokens - requested or filled_tokens

if ttl > 0 then
  redis.call("setex", tokens_key, ttl, new_tokens)
  redis.call("setex", timestamp_key, ttl, now)
end

return { allowed and 1 or 0, new_tokens }

And the revised script passes all test cases in RedisRateLimiterLuaScriptTests.java

Image

Sample

The load test using JMeter in the image below, by using request_rate_limiter.lua I get max qps of 7913.2, while by changing the script to the refined version I get max qps of 8479.2

Image

Given if put to use, the execution count of the lua script can be very large, a little optimization may bring in a significant improvement in performance.

So far I have done below tests.

  1. regression testing using RedisRateLimiterLuaScriptTests.java

Image

  1. load testing using my own application. Please help check what further testing needs to be done if you choose to involve this change.

iis-MarkKuang avatar Jan 26 '25 06:01 iis-MarkKuang