gloo icon indicating copy to clipboard operation
gloo copied to clipboard

ratelimit ability to not count ratelimited requests for specific descriptors

Open huzlak opened this issue 1 year ago • 8 comments

Gloo Edge Product

Enterprise

Gloo Edge Version

v1.14.7

Is your feature request related to a problem? Please describe.

Customer would like to define 2 levels of ratelimits, first based on org and package, the second based on org and integrator. For example limit of 5 requests/min for Integrator and 10 requests/min for Org. Currently if Integrator1 does 8 requests he gets 429 after request 5 for the last 3 requests. If in the same minute Integrator 2 tries to do 5 requests he starts getting 429 after 2 requests as also the ratelimited requests are counted for the package level ratelimit.

Example ratelimitconfigs:

apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: rl
  namespace: gloo-system
spec:
  raw:
    rateLimits:
    - setActions:
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
    setDescriptors:
    - rateLimit:
        requestsPerUnit: 10
        unit: MINUTE
      simpleDescriptors:
      - key: odyssey-org
      - key: odyssey-package
        value: basic
---
apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: rl2
  namespace: gloo-system
spec:
  raw:
    rateLimits:
    - setActions:
      - requestHeaders:
          descriptorKey: odyssey-int
          headerName: x-odc-int
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
    setDescriptors:
    - rateLimit:
        requestsPerUnit: 5
        unit: MINUTE
      simpleDescriptors:
      - key: odyssey-int
      - key: odyssey-package
        value: basic

Describe the solution you'd like

A flag for specific descriptor that would tell that requests ratelimited based on this descriptor should not increase any counters.

Describe alternatives you've considered

Using envoy ratelimit API instead of the set-style with nested descriptors and weight on the integrator descriptor:

apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: rl
  namespace: gloo-system
spec:
  raw:
    rateLimits:
    - actions:
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
    - actions:
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
      - requestHeaders:
          descriptorKey: odyssey-int
          headerName: x-odc-int
    descriptors:
    - key: odyssey-org
      descriptors:
      - rateLimit:
          requestsPerUnit: 10
          unit: MINUTE
        key: odyssey-package
        value: basic
        descriptors:
        - rateLimit:
            requestsPerUnit: 5
            unit: MINUTE
          weight: 1
          key: odyssey-int

This does not server the purpose as any number of integrators can do 5 requests regardless of their package as only counter for the descriptor with non-zero weight is increased.

Additional Context

No response

huzlak avatar Sep 20 '23 10:09 huzlak

zf-rl drawio

@sam-heilbron @nfuden updated diagram. Thanks @huzlak

SantoDE avatar Nov 20 '23 14:11 SantoDE

Whats the use case for not being able to define both the rate limits in the same config? https://docs.solo.io/gloo-edge/latest/guides/security/rate_limiting/set/#rule-priority

nfuden avatar Dec 15 '23 13:12 nfuden

With regular setup in one config if first(integrator) rule matches, the other(organization) rule counter is not incremented. It should be incremented, but only for the requests that were not ratelimited by integrator's limit as these should count in whole organization limit because requests to upstream happened. The ones ratelimited by integrator limit should not be counted as they were ratelimited already and did not go to the upstream. If we would apply alwaysApply then the organization rule would be always applied even if the first was too and also for ratelimited requests. Customer needs a way to not count requests ratelimited by integrator config. As all the headers come from extauth stage, they can not split into early and regular ratelimitconfigs.

Adding example test results for better clarification:

Without AlwaysApply:
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int2";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int3";done; echo
200 200 200 200 200 429 429 429

With AlwaysApply:

$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int2";done; echo
200 200 429 429 429 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int3";done; echo
429 429 429 429 429 429 429 429

Used ratelimit config:

spec:
  raw:
    rateLimits:
    - setActions:
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
      - requestHeaders:
          descriptorKey: odyssey-int
          headerName: x-odc-int
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
    setDescriptors:
    - rateLimit:
        requestsPerUnit: 5
        unit: MINUTE
      simpleDescriptors:
      - key: odyssey-org
      - key: odyssey-package
      - key: odyssey-int
    - alwaysApply: true
      rateLimit:
        name: group_limit
        requestsPerUnit: 10
        unit: MINUTE
      simpleDescriptors:
      - key: odyssey-org
      - key: odyssey-package

Desired behavior:

$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int2";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int3";done; echo
429 429 429 429 429 429 429 429

huzlak avatar Dec 18 '23 09:12 huzlak

So, if we look at the application policy for RL, we currently have "alwaysApply=true" and "alwaysApply=false". Now, say we have RL1 and RL2 in one config (pseudo code):

rateLimit:
  setDescriptors:
  - RL1-Integrator: 5 requests per minute
  - RL2-Organization: 10 requests per minute

The current behaviour is that either 1 rule is applied (alwaysApply=false) where the integrators are only rate-limited per integrator, and never as a group of integrators belonging to the same organization. Here every integrator has its own RL1 bucket. So that means, even if you want to rate limit on organization, every integrator will always be able to do 5 requests per minute, even if they, as a group, should be blocked at 10.

And the other option (alwaysApply=true) means that both rules always apply. And that means that a request will always be counted in both buckets, RL1 and RL2, even if RL1 has already rate-limited the request. So, if Integrator-1 starts sending hundreds of requests per second, they will be counted in both in Integrator-1's own RL1 bucket, but ALSO in the RL2 bucket for the organization. And this means that all of the other Integrators will be blocked as well by RL2.

So as I see it, the ask is to add an additional mode, say applyIfNotRateLimited=true (I'm very bad with names, and yes, this is a double negation, so you might want to come up with something better), of which the semantics should be that the rule should ONLY be applied if the request is not already rate-limited by a previous rule. And this should prevent requests already rate-limited by RL1 to also be counted in the RL2 bucket.

Something like this:

rateLimit:
  setDescriptors:
  - RL1-Integrator: 5 requests per minute
  - RL2-Organization: 10 requests per minute
     applyIfRateLimited=false

You could argue that it would make sense to introduce something like a mode property that could take a value out of set/enumeration of different modes: "alwaysApply", "applyIfNotRateLimited", instead of introducing another property. Argument for this is that "alwaysApply" and "applyIfNotRateLimited" cannot/should not be configured at the same time.

DuncanDoyle avatar Jan 25 '24 12:01 DuncanDoyle

Do we forsee the applyIfNotRateLimited use case ever being extended to a larger nested tree?

For example we say we have a place where you have the following rate limits

  1. Org total safety bucket, ie the dont blow up the service
  2. Team fairness bucket, ie dont let one sub team hog all the usage of a service within the org
  3. Sub group fairness, ie the Team itself may have 2 services and want to make sure that one of their 2 services never entirely consumes their team fairness bucket.

For example

  1. ORG: 100 requests per minute
  2. TEAM: 50 requests per minute pulled from header:team
  3. SubGroup: 40 requests per minute pulled from header:subgroup

nfuden avatar May 13 '24 12:05 nfuden

Hi @nfuden @huzlak any updates when this will be picked up we (ZF) have been waiting for this for a very long time..

akhilaj avatar Aug 12 '24 12:08 akhilaj