Why no retries for requests enforced by the local ratelimiter?

Open HeTvaM opened this issue 3 years ago • 0 comments

Description: I need Envoy to retry requests that were rejected (rate limited) by the local rate limiter instead of being forwarded to the upstream host.

I use the following HTTP filter:

http_filters:
- name: envoy.filters.http.local_ratelimit
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
    stat_prefix: http_local_rate_limiter

I configure the local rate limiter per route, together with a retry policy:

            routes:
            - match:
                prefix: "/"
              route:
                prefix_rewrite: "/"
                timeout: 15s
                cluster: app
                retry_policy:
                  retry_on: envoy-ratelimited
                  num_retries: 10
              typed_per_filter_config:
                envoy.filters.http.local_ratelimit:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
                  stat_prefix: app_ratelimit
                  token_bucket:
                    max_tokens: 5
                    tokens_per_fill: 5
                    fill_interval: 5s
                  filter_enabled:
                    runtime_key: local_rate_limit_enabled
                    default_value:
                      numerator: 100
                      denominator: HUNDRED
                  filter_enforced:
                    runtime_key: local_rate_limit_enforced
                    default_value:
                      numerator: 100
                      denominator: HUNDRED

According to the retry_policy documentation, setting retry_on: envoy-ratelimited makes Envoy retry a request when the response carries the x-envoy-ratelimited header. That header is added only if the disable_x_envoy_ratelimited_header field is not set to true. I could not find a description of this field for the local rate limiter in the documentation. There is enable_x_ratelimit_headers, which is disabled by default, but enabling it did not affect retries in any way.

So I tried to add the header myself with response_headers_to_add:

              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite_literal: app
                  prefix_rewrite: "/"
                  timeout: 15s
                  cluster: app
                  retry_policy:
                    retry_on: envoy-ratelimited
                    num_retries: 10
                typed_per_filter_config:
                  envoy.filters.http.local_ratelimit:
                    "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
                    stat_prefix: app_ratelimit
                    enable_x_ratelimit_headers: DRAFT_VERSION_03
                    token_bucket:
                      max_tokens: 5
                      tokens_per_fill: 5
                      fill_interval: 5s
                    filter_enabled:
                      runtime_key: local_rate_limit_enabled
                      default_value:
                        numerator: 100
                        denominator: HUNDRED
                    filter_enforced:
                      runtime_key: local_rate_limit_enforced
                      default_value:
                        numerator: 100
                        denominator: HUNDRED
                    response_headers_to_add:
                    - append: false
                      header:
                        key: x-envoy-ratelimited
                        value: 'true'

The response then contained the x-envoy-ratelimited header, but that had no effect either.

I test locally with siege and get 429 response codes, but in admin/stats the cluster.<cluster_name>.upstream_rq_retry counter never increases; it stays at 0.
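
For anyone reproducing this, here is a minimal Python sketch of how the retry counter can be read from the admin endpoint's plain-text stats, which are `name: value` lines. The admin address `http://localhost:9901`, the helper names `fetch_stats` and `parse_counters`, and the numbers in `sample` are assumptions made for illustration, not values taken from my setup.

```python
from urllib.request import urlopen


def fetch_stats(admin_url="http://localhost:9901"):
    """Download the plain-text stats dump (admin address is an assumption)."""
    with urlopen(f"{admin_url}/stats") as resp:
        return resp.read().decode("utf-8")


def parse_counters(stats_text):
    """Parse 'name: value' lines, keeping only plain integer values."""
    counters = {}
    for line in stats_text.splitlines():
        name, sep, value = line.partition(": ")
        value = value.strip()
        # Histogram lines and other non-integer values are skipped.
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


# Illustrative input only; these numbers are made up.
sample = """cluster.app.upstream_rq_retry: 0
cluster.app.upstream_rq_total: 5
app_ratelimit.http_local_rate_limit.rate_limited: 45
"""

counters = parse_counters(sample)
print(counters["cluster.app.upstream_rq_retry"])
```

Calling `parse_counters(fetch_stats())` before and after a siege run and comparing `cluster.app.upstream_rq_retry` shows whether any retry was attempted.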

In the console, if I run many requests at once, almost all of them get a 429 response code. That is, no retry happens. [Screenshot from 2022-08-08 12-30-51]
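
To make the expected numbers concrete, here is a simplified Python model of a token bucket with the settings from the config above (max_tokens: 5, tokens_per_fill: 5, fill_interval: 5s). It is only a sketch and not Envoy's implementation. The class `TokenBucket`, the helper `burst_status_codes`, and the assumption that the bucket starts full are mine.

```python
class TokenBucket:
    """Simplified token bucket; assumes the bucket starts full."""

    def __init__(self, max_tokens, tokens_per_fill, fill_interval):
        self.max_tokens = max_tokens
        self.tokens_per_fill = tokens_per_fill
        self.fill_interval = fill_interval
        self.tokens = max_tokens
        self.last_fill = 0.0

    def allow(self, now):
        # Add tokens for every whole fill interval that has elapsed.
        fills = int((now - self.last_fill) // self.fill_interval)
        if fills > 0:
            refilled = self.tokens + fills * self.tokens_per_fill
            self.tokens = min(self.max_tokens, refilled)
            self.last_fill += fills * self.fill_interval
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False


def burst_status_codes(n, bucket, now=0.0):
    """Status codes for n requests arriving at the same instant."""
    return [200 if bucket.allow(now) else 429 for _ in range(n)]


bucket = TokenBucket(max_tokens=5, tokens_per_fill=5, fill_interval=5.0)
codes = burst_status_codes(50, bucket)
print(codes.count(200), codes.count(429))  # prints "5 45"
```

In this model a burst of 50 simultaneous requests lets 5 through and rejects 45 with 429, which matches what I see when no retry takes place.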

I've already read through the documentation, but I can't find the reason or any explanation. What am I doing wrong?

HeTvaM, Aug 08 '22 09:08