redpanda
Rate limiting per client group
This PR adds rate limiting per client group. Clients can now be united into groups, and all clients in a group share a common rate quota. A group name is any string. A client belongs to a group if the prefix of its client_id is equal to the group name. Each group gets its own rate limiter. Previously, a limiter was created for every client.
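The grouping rule described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `TokenBucket`, `QuotaManager`, and `quota_key` are hypothetical names, the limiter is a simple token bucket, and picking the first matching group when several prefixes match is an assumption.

```python
import time


class TokenBucket:
    """Simple byte-rate limiter (hypothetical, for illustration only)."""

    def __init__(self, rate, now=None):
        self.rate = rate  # allowed bytes per second
        self.tokens = rate  # start with one second of budget
        self.last = time.monotonic() if now is None else now

    def record(self, nbytes, now=None):
        """Consume nbytes and return the suggested throttle delay in seconds."""
        now = time.monotonic() if now is None else now
        # Refill according to elapsed time, capped at one second of budget.
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= nbytes
        # A negative balance means the quota was exceeded.
        return max(0.0, -self.tokens / self.rate)


class QuotaManager:
    """Maps each client_id to a limiter shared by its group, if any."""

    def __init__(self, rate, groups):
        self.rate = rate
        self.groups = list(groups)
        self.limiters = {}

    def quota_key(self, client_id):
        # A client belongs to a group if its client_id starts with the group name.
        # Assumption: the first matching group wins.
        for group in self.groups:
            if client_id.startswith(group):
                return group
        # Clients outside any group keep a limiter of their own.
        return client_id

    def record(self, client_id, nbytes, now=None):
        key = self.quota_key(client_id)
        if key not in self.limiters:
            self.limiters[key] = TokenBucket(self.rate, now)
        return self.limiters[key].record(nbytes, now)
```

With a 100 B/s quota and the group `team-a`, two clients named `team-a-1` and `team-a-2` draw from the same bucket, while a client named `other` gets a bucket of its own.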
I have tested it manually in ducktape, but I couldn't find a way to implement a stable ducktape test: because each shard has a separate quota, we can't calculate the delays for requests that exceed the quota.
Potential problem: Every shard has its own quota
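To illustrate why per-shard quotas make the delays hard to predict in a test, here is a simplified model. It assumes each shard enforces the configured rate independently and computes the delay with a Kafka-style formula (excess rate divided by quota, scaled by the window). The function name and parameters are hypothetical and do not come from the Redpanda code.

```python
def throttle_delays(bytes_per_shard, quota_per_shard, window_s=1.0):
    """Return the delay in seconds each shard would apply for its observed traffic.

    Simplified model: every shard tracks only the bytes it saw itself.
    """
    delays = []
    for observed in bytes_per_shard:
        rate = observed / window_s
        excess = max(0.0, rate - quota_per_shard)
        # Delay long enough for the average rate to fall back to the quota.
        delays.append(excess / quota_per_shard * window_s)
    return delays


# Same total traffic (300 bytes in one second), different shard placement.
all_on_one_shard = throttle_delays([300, 0, 0], quota_per_shard=100)
spread_evenly = throttle_delays([100, 100, 100], quota_per_shard=100)
```

In this model the same client traffic is throttled when all connections land on one shard and not throttled at all when they are spread evenly, so the expected delay depends on connection placement that a test does not control.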
Backports Required
- [x] none - not a bug fix
- [ ] none - issue does not exist in previous branches
- [ ] none - papercut/not impactful enough to backport
- [ ] v22.3.x
- [ ] v22.2.x
- [ ] v22.1.x
Release Notes
Features
Clients can be united into a group so that they share a common rate quota.
Do we need some test?
I see that groups are defined as a prefix of the client ID string. I was wondering if there is a better way to do this. With the flex additions to the Kafka protocol, metadata can be shipped along with any Kafka struct. These are called tags.
There is support for tags within the request header too. If we could ensure that clients send the group ID within the tags metadata struct, then we could avoid parsing the client ID for a group ID altogether.
https://github.com/redpanda-data/redpanda/blob/dev/src/v/kafka/server/protocol_utils.cc#L89
The only downside is that all clients must make their requests at API versions that are new enough to support flex.
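For reference, a tagged-fields section in flexible versions of the Kafka protocol is encoded as an unsigned varint field count, followed by a tag (unsigned varint), a size (unsigned varint), and the raw bytes for each field. A minimal sketch of reading such a section is shown below. The function names and the tag number used for a group ID are hypothetical, since no such tag is defined by the protocol or by this PR.

```python
def read_uvarint(buf, pos):
    """Decode an unsigned varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):  # high bit clear marks the last byte
            return result, pos
        shift += 7


def read_tagged_fields(buf, pos=0):
    """Parse a tagged-fields section; return ({tag: raw_bytes}, new_pos)."""
    count, pos = read_uvarint(buf, pos)
    fields = {}
    for _ in range(count):
        tag, pos = read_uvarint(buf, pos)
        size, pos = read_uvarint(buf, pos)
        fields[tag] = bytes(buf[pos:pos + size])
        pos += size
    return fields, pos


# Hypothetical tag number for a client group ID (an assumption for this sketch).
GROUP_ID_TAG = 0

# One tagged field: tag 0, size 6, payload b"team-a".
header_tags = bytes([1, GROUP_ID_TAG, 6]) + b"team-a"
fields, end = read_tagged_fields(header_tags)
group_id = fields.get(GROUP_ID_TAG)
```

A request that carries no such tag would simply yield `None` here, so a fallback to the client ID prefix would still be needed for older clients.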
discuss: Is the client group TP limiting going to be applied to the response/fetch traffic?
/ci-repeat 10 skip-units dt-repeat=100 tests/rptest/tests/cluster_quota_test.py::ClusterRateQuotaTest
/ci-repeat 10 skip-units dt-repeat=100 tests/rptest/tests/cluster_quota_test.py::ClusterRateQuotaTest
Failure was k8s