draft-polli-ratelimit-headers Describe use in throttled requests

I expect

A description of how to use RateLimit-* in throttled responses, e.g. 429.

example

A user is over quota and receives, e.g., a 429.

Returning all the headers may be redundant, as

RateLimit-Remaining == 0
RateLimit-Reset == Retry-After

Not returning the triple may make things complex for clients.

GET /foo

HTTP/1.1 429 Too Many Requests
RateLimit-Limit: 1
RateLimit-Remaining: 0
RateLimit-Reset: 58
Retry-After: 58

Questions and proposals

Q1 - "The server MAY return all three headers" or "the server SHOULD return all three headers"?

Q2 - "Retry-After MUST be equal to RateLimit-Reset"?

Jul 17 '19 14:07 ioggstream

A server MAY return `RateLimit` response header fields independently
of the response status code.

If a response contains both the `Retry-After` and the `RateLimit-Reset` header fields,
their values MUST be consistent.

Aug 04 '19 17:08 ioggstream

Consistent as in "equal"?

Aug 04 '19 18:08 whiskeysierra

While I'd like to say equal, I don't want to restrict Retry-After implementations. If you have already implemented Retry-After with the HTTP-date syntax, you can adopt RateLimit-Reset in delta-seconds notation without having to modify your existing code.

This is because I consider Retry-After "higher in rank" than RateLimit.

We could rephrase it so that the RateLimit-Reset value MUST be consistent with Retry-After. This will help implementers concentrate on the new functionality rather than on changing existing code.

Let me know and thanks, R

Aug 05 '19 08:08 ioggstream

While I'd like to say equal, I don't want to restrict Retry-After implementations. If you have already implemented Retry-After with the HTTP-date syntax, you can adopt RateLimit-Reset in delta-seconds notation without having to modify your existing code.

That refers to equality of representation (both in seconds vs. one in seconds and one as an HTTP-date). Do we define consistency in terms of values? Meaning, should their values be equal (seconds = httpdate - now), or could one of them be bigger than the other and still be considered consistent?

Aug 05 '19 08:08 whiskeysierra

I'd say consistent means referencing the same moment in time, eg.

Date: Mon, 05 Aug 2019 09:27:00 GMT
Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
RateLimit-Reset: 5
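
The "same moment" reading of the example above can be checked mechanically. The following is only a sketch, assuming `RateLimit-Reset` is expressed in delta-seconds relative to the `Date` header and that `Retry-After` may be either delta-seconds or an HTTP-date; the function names and the `tolerance` parameter are hypothetical and not part of the draft.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def retry_after_moment(retry_after: str, date: datetime) -> datetime:
    """Resolve Retry-After (delta-seconds or HTTP-date) to an absolute time."""
    value = retry_after.strip()
    if value.isdigit():
        # Delta-seconds form: relative to the response Date.
        return date + timedelta(seconds=int(value))
    # HTTP-date form: already an absolute moment.
    return parsedate_to_datetime(value)


def same_moment(headers: dict, tolerance: int = 0) -> bool:
    """True if Retry-After and RateLimit-Reset reference the same moment.

    `tolerance` (seconds) is an illustrative knob, not something the draft defines.
    """
    date = parsedate_to_datetime(headers["Date"])
    reset = date + timedelta(seconds=int(headers["RateLimit-Reset"]))
    retry = retry_after_moment(headers["Retry-After"], date)
    return abs((retry - reset).total_seconds()) <= tolerance


example = {
    "Date": "Mon, 05 Aug 2019 09:27:00 GMT",
    "Retry-After": "Mon, 05 Aug 2019 09:27:05 GMT",
    "RateLimit-Reset": "5",
}
print(same_moment(example))
```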

If you have a rationale for proposing RateLimit-Reset referencing a moment <= or >= to Retry-After we can consider that!

Aug 05 '19 09:08 ioggstream

(sorry for reopening, perhaps we can continue discussion elsewhere)

Adding some feedback related to this matter.

In section 4 we can read:

If a response contains both the Retry-After and the RateLimit-Reset header fields, the value of RateLimit-Reset MUST be consistent with the one of Retry-After. [...] Under certain conditions, a server MAY artificially lower RateLimit field values between subsequent requests, eg. to respond to Denial of Service attacks or in case of resource saturation.

The DoS example would imply that if a Retry-After is also present then it should also take into account the value of RateLimit-Reset if RateLimit-Remaining is 0, as that reset time could have been incremented to deal with the resource saturation (assuming the option to increment RateLimit-Reset artificially is acceptable in the text).

In section 5 we deal with what happens when both headers are found together:

If a response contains both the RateLimit-Reset and Retry-After header fields, the Retry-After header field MUST take precedence and the RateLimit-Reset header field MAY be ignored.

So my understanding is that if both must be consistent and both are added to a response, and if we take consistency as roughly referring to the same point in time, they should agree and it wouldn't matter which one takes precedence, unless consistency means one is never sooner than the other. This consistency relation should be defined in more exact terms.

Example 6.1.4 reinforces the notion of consistency meaning "same moment":

A client exhausted its quota and the server throttles the request sending the Retry-After response header field. The values of Retry-After and RateLimit-Reset are consistent as they reference the same moment.

Unfortunately this does not define the consistency relation; it only reinforces the notion that they must refer to the same point in time when the remaining quota is 0.

Appendix D (FAQ), sections 5 and 7 also mention Retry-After

Point 5 mentions using code 429 + Retry-After for similar contexts as these new RateLimit headers, and point 7 mentions using code 503 + Retry-After for resource saturation, mentioning resources and/or metrics closely related to infrastructure. It reads:

Dynamically lowering the values returned by the rate-limit headers, and returning retry-after along with them can improve availability.

So any transient resource saturation can be dealt with by modifying the rate limiting headers as well as the Retry-After one in a similar fashion. It isn't clear from this text whether that would cover adding more time to RateLimit-Reset or not, but if so then Retry-After being "consistent" might mean pointing it to the same moment as RateLimit-Reset if there is no remaining quota (which is a piece of information also open to be lowered).

So my proposal is to define in clear terms what this "consistency" between Retry-After and RateLimit-Reset is and when/how to use each header.

In my understanding, we should always obey Retry-After at a minimum, and the information contained in the rate limiting headers should be purely informational until the time specified by Retry-After is reached.

That is, if both Retry-After and RateLimit-Reset are present in a response, then the client must honor Retry-After, and only after such time has elapsed, consider the rate limiting header. In this case, there is no need to define any consistency relation between both: just obey Retry-After.

However, after the time specified in Retry-After has elapsed, we might use the information contained in the rate limiting headers to approximate our traffic to how the limits apply. With the information in the rate limiting headers intact (ie. not modified to deal with a saturation of infrastructure resources), the agent can check whether the reset time has elapsed, and:

If the reset time has elapsed, it can assume its quota has been restored and perform a new request.
If the reset time has not elapsed, it can choose to use the remaining quota or, if no quota is left, wait until the reset time.
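
The client behaviour proposed above could be sketched roughly as follows. This is an illustration only: it assumes all values are delta-seconds measured from the moment the response was received, and the function and parameter names are made up for the example.

```python
from typing import Optional


def seconds_until_next_request(
    elapsed: float,
    retry_after: Optional[float],
    remaining: int,
    reset: float,
) -> float:
    """How long the client should still wait before sending a new request.

    elapsed:     seconds since the response was received
    retry_after: Retry-After in delta-seconds, or None if absent
    remaining:   RateLimit-Remaining
    reset:       RateLimit-Reset in delta-seconds
    """
    # Retry-After is honored first, whatever the RateLimit fields say.
    if retry_after is not None and elapsed < retry_after:
        return retry_after - elapsed
    # After that, the RateLimit fields are used as hints.
    if elapsed >= reset:
        return 0.0  # reset time has elapsed: assume the quota is restored
    if remaining > 0:
        return 0.0  # quota left in the current window
    return reset - elapsed  # no quota left: wait until the reset time


# Throttled response with Retry-After: 10, RateLimit-Remaining: 0, RateLimit-Reset: 30
print(seconds_until_next_request(0, 10, 0, 30))
print(seconds_until_next_request(10, 10, 0, 30))
```

Here the first call is governed by Retry-After and the second by the reset hint, which is the separation of roles the proposal argues for.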

This sorts out the problems arising from dynamic, transient issues with infrastructure without losing the information in the rate limiting headers, and allows the agent to know what their approximate quota is regardless of temporary circumstances.

In this proposal, Retry-After is taken as something more than just a hint, meaning "don't perform a request because it won't be served", whereas RateLimit-Reset coupled with a RateLimit-Remaining of 0 would be a hint that a new request is very likely to be rejected. There is no necessary relation between both.

That said, if we wanted to raise the hint of the combination of a RateLimit-Reset value with a RateLimit-Remaining quota of 0, then we could define the consistency as requiring that Retry-After should always be the same moment as RateLimit-Reset or later, in other words, any Retry-After in such situation should never point to some time before RateLimit-Reset.
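
As a sketch of that alternative relation (again assuming both values are delta-seconds taken from the same response; the function name is hypothetical), the check would only apply when the quota is exhausted:

```python
def retry_after_not_before_reset(retry_after: int, reset: int, remaining: int) -> bool:
    """Consistency as 'Retry-After is never earlier than RateLimit-Reset'.

    The relation is only meaningful when RateLimit-Remaining is 0;
    with quota left, no constraint is imposed.
    """
    if remaining > 0:
        return True
    return retry_after >= reset


print(retry_after_not_before_reset(retry_after=58, reset=58, remaining=0))
print(retry_after_not_before_reset(retry_after=5, reset=58, remaining=0))
```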

I don't think there is any other meaningful relation in terms of a quota higher than 0.

Even then, I think we can clearly separate Retry-After from the case of RateLimit-Reset + RateLimit-Remaining: 0 by forcing agents to honor Retry-After and take the rate limit reset time as a strong hint that their requests won't be serviced until that time, yet they would be free to try again, as the headers just provide a (strong) hint rather than a fixed requirement.

Upgrading them from hints to requirements would then render a Retry-After earlier than the reset time with a 0 quota nonsensical.

A Retry-After with a point in time after the combination of RateLimit-Reset with RateLimit-Remaining: 0 makes sense, because that would mean some of the issues Retry-After deals with (I want to think more in terms of infrastructure, scaling, service resources issues) are present and the rate limiting policy is only temporarily overridden, all the while providing some information in the (unmodified) headers that could be useful after the Retry-After time elapses.

Thoughts? @ioggstream

Nov 04 '19 16:11 unleashed

Hi @unleashed, agreed that "consistent" is not clearly defined. We should tackle it.

RateLimit-Reset and Retry-After

there is no need to define any consistency relation between both: just obey Retry-After.

Agreed: RateLimit-Reset is quota information, while Retry-After is a server statement.

My original idea of consistency was same moment in time, but it's fine to KISS and remove the consistency concept (which is complex and probably does not add that much to the spec).

Dynamic limits

It isn't clear from this text whether that would cover adding more time to RateLimit-Reset or not

It's up to the server to decide how to modify headers and communicate limits. This spec should just define the header names and their semantics.

Dynamically lowering the values returned by the rate-limit headers, and returning retry-after along with them can improve availability.

This sentence should be fixed, as it mixes two use-cases:

the first, where a server reduces limits or increases the reset time to tell the client to "slow down"
the second where the server tells the client to "stop"
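
To make the two use-cases concrete, here is a rough server-side sketch. Everything in it is an assumption made for illustration: the helper names, the choice to halve the advertised remaining quota, and the use of a 200 status for the "slow down" case are not defined by the draft.

```python
def slow_down(limit: int, remaining: int, reset: int, factor: int = 2):
    """'Slow down': still serve the request, but advertise a reduced quota."""
    headers = {
        "RateLimit-Limit": str(limit),
        # Artificially lowered value (illustrative policy: divide by `factor`).
        "RateLimit-Remaining": str(remaining // factor),
        "RateLimit-Reset": str(reset),
    }
    return 200, headers


def stop(limit: int, reset: int, retry_after: int):
    """'Stop': reject the request and tell the client when to come back."""
    headers = {
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": "0",
        "RateLimit-Reset": str(reset),
        "Retry-After": str(retry_after),
    }
    return 429, headers


print(slow_down(limit=100, remaining=40, reset=30))
print(stop(limit=1, reset=58, retry_after=58))
```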

A corner case is when the window is very short and is continuously reset (e.g. 1 second): in these cases a 429 seems to me more of a slow-down than a stop :) but that's a corner case.

PS: did you register for IETF106? https://www.ietf.org/registration/ietf106/remotereg.py It would be great if you joined the meeting for the RateLimit presentation!

Nov 04 '19 17:11 ioggstream

It's up to the server to decide how to modify headers and communicate limits. This spec should just define the header names and their semantics.

Right. My suggestion in this context is hinting at the use of Retry-After for transient conditions that are likely to be short-lived such as DoS and similar, rather than using the rate-limiting headers for that purpose in order to avoid redundancy and provide useful information usable after the condition disappears.

This sentence should be fixed, as it mixes two use-cases:

OK, let me think about some rewording along the lines of the suggestions above, and I'll submit a PR and reference this issue.

PS: did you register for IETF106? https://www.ietf.org/registration/ietf106/remotereg.py It would be great if you joined the meeting for the RateLimit presentation!

Thanks for the link, I wasn't quite aware of the possibility of attending remotely. Just registered!

Nov 05 '19 09:11 unleashed

hinting at the use of Retry-After for [..] short-lived conditions such as DoS and similar, rather than using the rate-limiting headers

IMHO that depends on whether:

the server wants to slow down the client by advertising a reduced quota
the server wants to push back on the client

We can provide some example requests/responses and ask the HTTP community for feedback.

About IETF106 it would be great if we prepare the 10 minute talk together so that if I cannot attend for whatever reason you can take my place :)

Nov 05 '19 12:11 ioggstream

About IETF106 it would be great if we prepare the 10 minute talk together so that if I cannot attend for whatever reason you can take my place :)

Sure, will ping you over IM.

Nov 05 '19 15:11 unleashed
