caddy-ratelimit Multiple zones definition and processing order

Hello,

First, I would like to thank you for this amazing module!

I recently ran some tests with it and have some questions about how multiple zones are handled. This is my configuration:

        # Rate limit
        rate_limit {
                zone target_per_ip {
                        key target_per_ip-{client_ip}-{http.request.host}
                        events 1200
                        window 1m
                }
                zone per_ip {
                        key per_ip-{client_ip}
                        events 3000
                        window 1m
                }
                zone per_target {
                        key per_target-{http.request.host}
                        events 6000
                        window 1m
                }
                log_key
        }
        reverse_proxy http://127.0.0.80:80

What I'm trying to achieve is:

Rate limit one specific IP that is constantly hammering one specific target domain (target_per_ip).
Rate limit one specific IP that is constantly hammering any target domain (per_ip).
Rate limit one specific target domain that is constantly hammered by any source IP (per_target).

When running some tests, for example with ab from a remote host and some aggressive parameters, the first rate limit to trigger is target_per_ip, which is correct:

{"level":"info","ts":1733027294.5232427,"logger":"http.handlers.rate_limit","msg":"rate limit exceeded","zone":"target_per_ip","wait":57.565922146,"remote_ip":"94.x.97.100","key":"target_per_ip-94.x.97.100-www.target.ch"}
{"level":"info","ts":1733027294.52412,"logger":"http.handlers.rate_limit","msg":"rate limit exceeded","zone":"target_per_ip","wait":57.565031375,"remote_ip":"94.x.97.100","key":"target_per_ip-94.x.97.100-www.target.ch"}
{"level":"info","ts":1733027294.525319,"logger":"http.handlers.rate_limit","msg":"rate limit exceeded","zone":"target_per_ip","wait":57.563836569,"remote_ip":"94.x.97.100","key":"target_per_ip-94.x.97.100-www.target.ch"}

Then, after a few seconds, it triggers the second limit, per_ip, which is a bit more tolerant:

{"level":"info","ts":1733027182.089967,"logger":"http.handlers.rate_limit","msg":"rate limit exceeded","zone":"per_ip","wait":53.771214735,"remote_ip":"94.x.97.100","key":"per_ip-94.x.97.100"}
{"level":"info","ts":1733027182.090872,"logger":"http.handlers.rate_limit","msg":"rate limit exceeded","zone":"per_ip","wait":53.770307777,"remote_ip":"94.x.97.100","key":"per_ip-94.x.97.100"}
{"level":"info","ts":1733027182.0909524,"logger":"http.handlers.rate_limit","msg":"rate limit exceeded","zone":"per_ip","wait":53.770228556,"remote_ip":"94.x.97.100","key":"per_ip-94.x.97.100"}

And finally it triggers the third limit, per_target, which is even more tolerant:

{"level":"info","ts":1733027183.5868707,"logger":"http.handlers.rate_limit","msg":"rate limit exceeded","zone":"per_target","wait":52.27425604,"remote_ip":"94.x.97.100","key":"per_target-www.target.ch"}
{"level":"info","ts":1733027183.587857,"logger":"http.handlers.rate_limit","msg":"rate limit exceeded","zone":"per_target","wait":52.273271302,"remote_ip":"94.x.97.100","key":"per_target-www.target.ch"}
{"level":"info","ts":1733027183.5879197,"logger":"http.handlers.rate_limit","msg":"rate limit exceeded","zone":"per_target","wait":52.273207694,"remote_ip":"94.x.97.100","key":"per_target-www.target.ch"}

My question is: as soon as a request hits one of the limits (in this example per_ip) and is answered with a 429, the counters of the other zones still seem to increase, even though the request was denied.

In the end, the "attacker" who is already blocked with a 429 by the earlier zones is able to cause a 429 to be returned to anyone else trying to visit the target domain, even though its own requests are being 429'ed.

Is there some kind of zone ordering, or some way to specify that a request already rejected with a 429 because it exceeded one zone doesn't increase the other zones' counters? I'm not sure if my question is clearly worded :)

Side question: I'm using {client_ip} in the key because it should be populated with the real client IP when the request comes through trusted proxies (for example the Cloudflare IP ranges). Is this the right way to do it? Will the trusted proxies handling replace client_ip with the real IP before the rate limit is checked?

We define them this way (with a module that automatically fetches the Cloudflare IP ranges):

        servers {
                protocols h1 h2
                trusted_proxies cloudflare {
                        interval 12h
                        timeout 15s
                }
        }

Thanks a lot for taking the time to read this, and for any feedback!

Kind regards

Dec 02 '24 04:12 sriccio

So to clarify, do you mean that when a request exceeds the per_ip rate limit, it also counts against the more permissive per_target rate limit?

Dec 03 '24 13:12 mholt

Hello @mholt, thanks for the reply.

Yes, that's what I am experiencing. I was not expecting a request blocked by one of the zones to still be counted as a hit in the other zones.

It's probably by design, and I am going about this the wrong way.

It seems to me that an attacker who is rate limited by the per_ip zone and receives a 429 still increases the counter of the per_target zone, even though the request is not served.

Kind regards

Dec 03 '24 15:12 sriccio

Hmm, that is odd if that's really what's happening. While iterating the zones, we return immediately at the first one that is exceeded. They are sorted by order of how restrictive they are (in terms of events per window).
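To illustrate why the check order matters when the loop returns at the first exceeded zone, here is a minimal Go sketch. It is not the module's code: the `zone` type and the `allow` function are hypothetical, the time window is ignored entirely, and it assumes each zone records the event at the moment it is checked. Under those assumptions, every zone checked before the rejecting one has already counted the request.

```go
package main

import "fmt"

// zone is a hypothetical, simplified rate limit zone: a plain counter
// with a maximum. The time window is deliberately left out.
type zone struct {
	name      string
	maxEvents int
	count     int
}

// allow checks the zones in slice order and stops at the first one that
// is exceeded. It returns the name of the rejecting zone, or "" if the
// request passes every zone. Zones checked before the rejecting one
// have already recorded the event.
func allow(zones []*zone) string {
	for _, z := range zones {
		if z.count >= z.maxEvents {
			return z.name
		}
		z.count++
	}
	return ""
}

func main() {
	perTarget := &zone{name: "per_target", maxEvents: 6}
	perIP := &zone{name: "per_ip", maxEvents: 3}

	// Most permissive zone first: per_target keeps counting requests
	// that per_ip goes on to reject.
	zones := []*zone{perTarget, perIP}
	for i := 1; i <= 8; i++ {
		rejectedBy := allow(zones)
		fmt.Printf("request %d: rejected by %q, per_target=%d, per_ip=%d\n",
			i, rejectedBy, perTarget.count, perIP.count)
	}
}
```

With the permissive zone first, requests 4 to 6 are rejected by per_ip but still fill per_target, which then rejects requests 7 and 8. With the slice order reversed, per_ip rejects first and per_target stays at 3.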

Dec 03 '24 16:12 mholt

I can confirm the same behavior during my tests.

Aug 19 '25 19:08 samrg472

Looking at the code provided, the issue is that the order in which the limiter tests the zones is incorrect. I have provided an example Caddyfile, and I added debug logging to confirm what was happening for each rate limit test. I expect per_header to be tested first, since it is more restrictive.

2025/08/19 20:34:14.544 INFO    http.handlers.rate_limit        Testing zone    {"zone": "global", "permissiveness": 0.00000000016666666666666666}
2025/08/19 20:34:14.544 INFO    http.handlers.rate_limit        Testing zone    {"zone": "per_header", "permissiveness": 0.00000000008333333333333333}

http://localhost:8080 {
  rate_limit {
    zone per_header {
      match {
        path /
      }
      key     {http.request.header.test}
      events  5
      window  60s
    }

    zone global {
      match {
        path /
      }
      key     static
      events  10
      window  60s
    }

    log_key
  }

  respond "Hello world!"
}

Aug 19 '25 20:08 samrg472

If I lower the window in per_header (e.g. to 20 seconds), I get the ordering I expect for the zone tests. In this particular case, I would rather be able to disable automatic ordering, since it can lead to unexpected behavior.

I wrote a test that can demonstrate this behavior:

func TestPermissivenessOrdering(t *testing.T) {
	maxEvents := 10
	// Admin API must be exposed on port 2999 to match what caddytest.Tester does
	config := fmt.Sprintf(`{
	"admin": {"listen": "localhost:2999"},
	"apps": {
		"http": {
			"servers": {
				"demo": {
					"listen": [":8080"],
					"routes": [{
						"handle": [
							{
								"handler": "rate_limit",
								"rate_limits": {
									"zone1": {
										"match": [{"method": ["GET"]}],
										"key": "{http.request.orig_uri.path}",
										"window": "60s",
										"max_events": %d
									},
									"zone2": {
										"match": [{"method": ["GET"]}],
										"key": "static",
										"window": "60s",
										"max_events": %d
									}
								},
								"log_key": true
							},
							{
								"handler": "static_response",
								"status_code": 200
							}
						]
					}]
				}
			}
		}
	}
}`, maxEvents, maxEvents*2)

	initTime()

	tester := caddytest.NewTester(t)
	tester.InitServer(config, "json")

	// zone1 is keyed by request path, so each path gets its own budget of
	// maxEvents requests. zone2 uses a static key and allows maxEvents*2
	// requests across all paths.
	for i := 0; i < maxEvents; i++ {
		tester.AssertGetResponse("http://localhost:8080/permissive1", 200, "")
	}
	tester.AssertGetResponse("http://localhost:8080/permissive1", 429, "")

	for i := 0; i < maxEvents; i++ {
		tester.AssertGetResponse("http://localhost:8080/permissive2", 200, "")
	}
	tester.AssertGetResponse("http://localhost:8080/permissive2", 429, "")

	// zone2 should have counted only the 2*maxEvents accepted requests, so a
	// request to a fresh path should now be rejected by the more permissive zone.
	tester.AssertGetResponse("http://localhost:8080/permissive3", 429, "")
}

As it stands, this test will fail even though zone1 is more restrictive than zone2. Try changing the zone1 window to 1s and the test will pass.

Aug 19 '25 21:08 samrg472