ratelimit icon indicating copy to clipboard operation
ratelimit copied to clipboard

Feature request: Stop incrementing cache keys when overall status is over limited

Open renuka-fernando opened this issue 2 years ago • 2 comments

Scenario

Let's say we have grouped routes

  1. Group: Foo
    1. /foo/v1/hello-1
  2. Group: Bar
    1. /bar/v1/hello-1
    2. /bar/v1/hello-special

And we need to apply the following rate limits.

  1. Per group -> 10 req/min
  2. Special rate limit for the route /bar/v1/hello-special in group Bar -> 3 req/min

Let's say the client invokes /bar/v1/hello-special 3 times, the remaining quota for /bar/v1/hello-1 should be 7 since both routes are belonged to group Bar. We can achieve this by adding two actions in the route /bar/v1/hello-special as follows.

Proxy yaml

- match:
    prefix: /foo/v1/hello-1
  route:
    cluster: mock
    rate_limits:
      - actions:
          - generic_key:
              descriptor_key: "group"
              descriptor_value: "foo"


- match:
    prefix: /bar/v1/hello-1
  route:
    cluster: mock
    rate_limits:
      - actions:
          - generic_key:
              descriptor_key: "group"
              descriptor_value: "bar"
- match:
    prefix: /bar/v1/hello-special
  route:
    cluster: mock
    rate_limits:
      - actions:
          - generic_key:
              descriptor_key: "group"
              descriptor_value: "bar"
      - actions:
          - generic_key:
              descriptor_key: "special-policy"
              descriptor_value: "3PerMin"

Rate Limit Service Config Yaml

descriptors:
  - key: group
    rate_limit:
      unit: minute
      requests_per_unit: 10
  - key: special-policy
    value: 3PerMin
    rate_limit:
      unit: minute
      requests_per_unit: 3

Current Issue

Let's say the client invokes /bar/v1/hello-special 8 times within 1min. So it gets

  • 200 OK - 3 times
  • 429 Too Many Requests - 5 times

Then invokes /bar/v1/hello-1 15 times in the same minute.

  • 200 OK - 2 times (/bar/v1/hello-special consumed 8 requests but there were only 3 success requests)
  • 429 Too Many Requests - 13 times

The total 200 OK requests for the group Bar is 5 even though the defined rate limit is 10 req/min.

Reason

The reason is we are always incrementing the Redis cache even though the request is over-limited. Enabling the local cache will stop increasing the key special-policy_3PerMin (which is actually over-limited) but keep increasing the group_bar (this is not over-limited, but the overall status is over-limited).

Proposal

Introduce a setting in the rate limit service to stop increasing Redis keys when the overall code is over the limit.

STOP_INCREASING_KEYS_FOR_OVER_LIMIT=true

renuka-fernando avatar Jul 26 '22 12:07 renuka-fernando

Hey @mattklein123 We might have this requirement when there are common cache keys among routes right? wdyt?

If I summarize the issue: we have two routes foo and bar with following descriptors

  • /foo
    (key1, val1) -> 10 req/min
    
  • /bar
    (key1, val1) -> 10 req/min
    (key1, val1), (key2, val2) -> 3 req/min
    

If client invokes /bar 10 times in one min (3 HTTP/200 + 7 HTTP/429) it can't invoke /foo with success, because it it already rate limitted. If we check actual calls to the upstream, it is just 3 calls to the upstream.

renuka-fernando avatar Jul 27 '22 10:07 renuka-fernando

Hey @ysawa0, Any idea on this? Your comments are really appreciated.

Since the cache key (Redis/Memcached) is already incremented before checking the limit is over, as a solution that comes to my mind, we can check the local cache (if enabled) and stop incrementing the cache. Otherwise, we can decrement the cache key again if the overall statis is over-limited. Since that could be overhead, we can introduce a setting. This overhead will be further reduced if local caching is enabled.

renuka-fernando avatar Jul 28 '22 10:07 renuka-fernando

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] avatar Aug 27 '22 12:08 github-actions[bot]

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

github-actions[bot] avatar Sep 03 '22 16:09 github-actions[bot]

Hi @mattklein123,

Do you have any updates on this issue?

It would be really appreciated if we could fix this issue since this is a considerable use case when applying rate limits for a group of routes.

Thank you

chashikajw avatar Jun 01 '23 00:06 chashikajw