optimize rate limit using Lua script
Description
I see rate limiting as an important feature that should be implemented as a separate function instead of sharing functions (SetRollingWindow, GetRollingWindow) with other functionality, which can waste resources. So I wrote a RunScript func that uses a Lua script for rate limiting.
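For illustration, here is a minimal sketch of the approach (not the exact code in this PR), assuming a go-redis v9 client; the key layout, script body, and RunScript signature are simplified stand-ins:

```go
package limiter

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// slidingWindowScript trims expired members, counts what is left, and only
// records the new request when the count is still under the limit. Doing all
// of this atomically in one Lua script avoids separate ZADD/ZRANGE round trips
// and keeps the sorted set bounded to roughly `rate` members instead of one
// member per request.
var slidingWindowScript = redis.NewScript(`
local key    = KEYS[1]
local now    = tonumber(ARGV[1]) -- current time in milliseconds
local window = tonumber(ARGV[2]) -- window length in milliseconds ("per")
local limit  = tonumber(ARGV[3]) -- allowed requests per window ("rate")
local member = ARGV[4]           -- unique member for this request

-- drop members that have fallen out of the window
redis.call('ZREMRANGEBYSCORE', key, '-inf', now - window)

if redis.call('ZCARD', key) < limit then
    redis.call('ZADD', key, now, member)
    redis.call('PEXPIRE', key, window)
    return 0 -- allowed
end
return 1 -- blocked
`)

// RunScript reports whether the request identified by key should be blocked.
func RunScript(ctx context.Context, rdb *redis.Client, key string, per time.Duration, rate int64) (bool, error) {
	now := time.Now()
	res, err := slidingWindowScript.Run(ctx, rdb, []string{key},
		now.UnixMilli(), per.Milliseconds(), rate, now.UnixNano()).Int64()
	if err != nil {
		return false, err
	}
	return res == 1, nil
}
```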
Related Issue
https://github.com/TykTechnologies/tyk/issues/5920
Motivation and Context
I hit the same issue described there: the rate limit doesn't work correctly under high QPS.
How This Has Been Tested
I built a sample API to compare the rate limit before and after the improvement, using a per value of 300, a rate value of 60, and a load generator firing 20k requests through 5 worker pools (a minimal sketch of the generator follows the results):
With the original rate limit, the run takes 60 seconds to complete. CPU usage increases by 60%, and the sorted set grows to 2MB with nearly 20k members.
With the improved rate limit, the run takes only 6 seconds. CPU usage peaks at 6%, and the sorted set shrinks to 1KB with 60 members.
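For reference, a minimal sketch of the kind of load generator used here; the target URL and hard-coded counts are placeholders, not the actual test setup:

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

func main() {
	const (
		workers  = 5     // worker pool size
		requests = 20000 // total requests to fire
		// hypothetical gateway endpoint used for the comparison
		target = "http://localhost:8080/sample-api/"
	)

	jobs := make(chan struct{}, requests)
	for i := 0; i < requests; i++ {
		jobs <- struct{}{}
	}
	close(jobs)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				resp, err := http.Get(target)
				if err != nil {
					continue
				}
				// responses (including 429s) are only drained, not inspected
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()
	fmt.Println("load run complete")
}
```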
Screenshots (if appropriate)
Original rate limit
Lua script rate limit
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Refactoring or add test (improvements in base code or adds test coverage to functionality)
Checklist
- [ ] I ensured that the documentation is up to date
- [ ] I explained why this PR updates go.mod in detail with reasoning why it's required
- [ ] I would like a code coverage CI quality gate exception and have explained why
Apply Sweep Rules to your PR?
- [ ] Apply: All new business logic should have corresponding unit tests.
- [ ] Apply: Refactor large functions to be more modular.
- [ ] Apply: Add docstrings to all functions and file headers.
It's unfortunate that a sliding log implementation is in use here, but recent versions let you opt into a fixed window rate limiter behaviour, and the performance/resource usage penalty of the sliding log is now clearly stated in the docs. Other behaviours to consider are spike arrest, which blocks traffic according to an ongoing rate (requests over the rate count against the rate and are blocked), and rate limit smoothing (the ability to increase limits gradually to allow more traffic).
https://tyk.io/docs/getting-started/key-concepts/rate-limiting/#fixed-window-rate-limiter https://tyk.io/docs/getting-started/key-concepts/rate-limiting/#rate-limit-smoothing
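For comparison, a rough sketch of the fixed window technique in general (not Tyk's actual implementation), again assuming a go-redis v9 client; memory stays at one counter per client per window rather than one sorted-set member per request:

```go
package limiter

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// fixedWindowAllow increments a per-window counter and allows the request
// while the counter stays at or below rate. The key naming scheme and the
// assumption that per is at least one second are illustrative.
func fixedWindowAllow(ctx context.Context, rdb *redis.Client, key string, per time.Duration, rate int64) (bool, error) {
	// One bucket per window: the key expires when the window rolls over.
	window := time.Now().Unix() / int64(per.Seconds())
	bucket := fmt.Sprintf("%s:%d", key, window)

	count, err := rdb.Incr(ctx, bucket).Result()
	if err != nil {
		return false, err
	}
	if count == 1 {
		// First hit in this window: set the TTL so the bucket cleans itself up.
		rdb.Expire(ctx, bucket, per)
	}
	return count <= rate, nil
}
```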
Per endpoint rate limits are scheduled for the next release.
Unfortunately your PR cannot be accepted, mostly due to this feature drift on our side, as well as how you approached the change. Our storage implementation is a particular area that's not well suited for rate limits, as it's focused on providing an interface for several other processes. If you look into internal/rate, or anywhere around rate limits today, you'll find several positive changes made in the meantime that try to address your root issue. I'm happy to revisit any feedback you may have against the >=5.4 state (5.5 is getting released soon).
In particular, the sliding log is used to provide buffered access-log-like functionality in the analytics area, and we're mainly keeping it around for compatibility at this point.