draft-polli-ratelimit-headers
Describe use in throttled requests
I expect
A description of how to use RateLimit-* in throttled responses, eg. 429
example
A user is over quota and receives eg. a 429.
Returning all the headers may be redundant, as
- RateLimit-Remaining == 0
- RateLimit-Reset == Retry-After
Not returning the triple may make things complex for clients
GET /foo
HTTP/1.1 429 Too Many Requests
RateLimit-Limit: 1
RateLimit-Remaining: 0
RateLimit-Reset: 58
Retry-After: 58
Questions and proposals
Q1 - "the server MAY return all three headers" or "the server SHOULD return all three headers"?
Q2 - "Retry-After MUST be equal to RateLimit-Reset"
A server MAY return `RateLimit` response header fields independently
of the response status code.
If a response contains both the `Retry-After` and the `RateLimit-Reset` header fields,
their values MUST be consistent.
Consistent as in "equal"?
While I'd like to say equal, I don't want to restrict Retry-After implementations. If you have already implemented Retry-After with the HTTP-date syntax, you can adopt RateLimit-Reset in delta-seconds notation without having to modify your existing code.
This is because I consider Retry-After "higher in rank" than RateLimit.
We could rephrase so that the RateLimit-Reset value MUST be consistent with Retry-After. This will help implementors concentrate on the new functionality rather than on changing the existing one.
Let me know and thanks, R
While I'd like to say equal, I don't want to restrict Retry-After implementations. If you have already implemented Retry-After with the HTTP-date syntax, you can adopt RateLimit-Reset in delta-seconds notation without having to modify your existing code.
That refers to equality of representation (seconds in both fields vs. seconds in one and an HTTP-date in the other). Do we define consistency in terms of values? Meaning, should their values be equal (seconds = HTTP-date - now), or could one of them be bigger than the other and still be considered consistent?
I'd say consistent means referencing the same moment in time, eg.
Date: Mon, 05 Aug 2019 09:27:00 GMT
Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
RateLimit-Reset: 5
If you have a rationale for proposing a `RateLimit-Reset` referencing a moment <= or >= to `Retry-After`, we can consider that!
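A minimal sketch of the "same moment" reading, using the three header values from the example above. It assumes the `Date` header is the reference point for delta-seconds; the function names and the `tolerance_s` parameter are hypothetical and not part of the draft.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def retry_after_moment(retry_after: str, date: datetime) -> datetime:
    """Resolve Retry-After (delta-seconds or HTTP-date) to an absolute time."""
    value = retry_after.strip()
    if value.isdigit():
        # delta-seconds form: relative to the response's Date (assumption)
        return date + timedelta(seconds=int(value))
    # HTTP-date form: already an absolute time
    return parsedate_to_datetime(value)


def same_moment(date_hdr: str, retry_after: str, ratelimit_reset: str,
                tolerance_s: int = 0) -> bool:
    """True if both fields point at the same instant, within tolerance_s."""
    date = parsedate_to_datetime(date_hdr)
    retry_at = retry_after_moment(retry_after, date)
    reset_at = date + timedelta(seconds=int(ratelimit_reset))
    return abs((retry_at - reset_at).total_seconds()) <= tolerance_s


headers = {
    "Date": "Mon, 05 Aug 2019 09:27:00 GMT",
    "Retry-After": "Mon, 05 Aug 2019 09:27:05 GMT",
    "RateLimit-Reset": "5",
}
print(same_moment(headers["Date"], headers["Retry-After"],
                  headers["RateLimit-Reset"]))  # True
```

A server that already emits `Retry-After` as an HTTP-date would pass this check without changing that code, which is the point made earlier about not restricting existing implementations.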
(sorry for reopening, perhaps we can continue discussion elsewhere)
Adding some feedback related to this matter.
- In section 4 we can read:
If a response contains both the Retry-After and the RateLimit-Reset header fields, the value of RateLimit-Reset MUST be consistent with the one of Retry-After. [...] Under certain conditions, a server MAY artificially lower RateLimit field values between subsequent requests, eg. to respond to Denial of Service attacks or in case of resource saturation.
The DoS example would imply that, if a `Retry-After` is also present, it should also take into account the value of `RateLimit-Reset` if `RateLimit-Remaining` is `0`, as that reset time could have been incremented to deal with the resource saturation (assuming the option to increment `RateLimit-Reset` artificially is acceptable in the text).
- In section 5 we deal with what happens when both headers are found together:
If a response contains both the RateLimit-Reset and Retry-After header fields, the Retry-After header field MUST take precedence and the RateLimit-Reset header field MAY be ignored.
So my understanding is that if both must be consistent and both are added to a response, and if we take consistency as roughly referring to the same point in time, they should agree and it wouldn't matter which one takes precedence, unless consistency means that one is never sooner than the other. This consistency relation should be defined in more exact terms.
- Example 6.1.4 reinforces the notion of consistency meaning "same moment":
A client exhausted its quota and the server throttles the request sending the Retry-After response header field. The values of Retry-After and RateLimit-Reset are consistent as they reference the same moment.
Unfortunately this does not define the consistency relation, and it reinforces the notion that both must refer to the same point in time when the remaining quota is `0`.
- Appendix D (FAQ), sections 5 and 7 also mention `Retry-After`.
Point 5 mentions using code 429 + Retry-After for similar contexts as these new RateLimit headers, and point 7 mentions using code 503 + Retry-After for resource saturation, mentioning resources and/or metrics closely related to infrastructure. It reads:
Dynamically lowering the values returned by the rate-limit headers, and returning retry-after along with them can improve availability.
So any transient resource saturation can be dealt with by modifying the rate limiting headers as well as the `Retry-After` one in a similar fashion. It isn't clear from this text whether that would cover adding more time to `RateLimit-Reset` or not, but if so then `Retry-After` being "consistent" might mean pointing it to the same moment as `RateLimit-Reset` if there is no remaining quota (which is a piece of information also open to be lowered).
So my proposal is to define in clear terms what this "consistency" between `Retry-After` and `RateLimit-Reset` is and when/how to use each header.
In my understanding, we should always obey `Retry-After` at a minimum, and the information contained in the rate limiting headers should be purely informational until the time specified by `Retry-After` is reached.
That is, if both `Retry-After` and `RateLimit-Reset` are present in a response, then the client must honor `Retry-After`, and only after such time has elapsed, consider the rate limiting header. In this case, there is no need to define any consistency relation between both: just obey `Retry-After`.
However, after the time specified in `Retry-After` has elapsed, we might use the information contained in the rate limiting headers to approximate our traffic to how the limits apply. With the information in the rate limiting headers intact (ie. not modified to deal with a saturation of infrastructure resources), the agent can check whether the reset time has elapsed, and:
- If the reset time has elapsed, it can assume its quota has been restored and perform a new request.
- If the reset time has not elapsed, it can choose to use the remaining quota or, if no quota is left, wait until the reset time.
This sorts out the problems arising from dynamic, transient issues with infrastructure without losing the information in the rate limiting headers, and allows the agent to know what their approximate quota is regardless of temporary circumstances.
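The client behaviour proposed above can be sketched as follows. This is only an illustration under stated assumptions: the `Throttle` record and `seconds_to_wait` function are hypothetical names, both fields are assumed to be already converted to delta-seconds, and the client clock is used as the time reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Throttle:
    received_at: float            # client clock (seconds) when the response arrived
    retry_after: Optional[float]  # Retry-After as delta-seconds, if present
    remaining: Optional[int]      # RateLimit-Remaining, if present
    reset: Optional[float]        # RateLimit-Reset as delta-seconds, if present


def seconds_to_wait(t: Throttle, now: float) -> float:
    """How long to wait before the next request (0.0 means go ahead)."""
    elapsed = now - t.received_at
    # 1. Retry-After is binding: never send before it has elapsed.
    if t.retry_after is not None and elapsed < t.retry_after:
        return t.retry_after - elapsed
    # 2. After that, the RateLimit fields are a hint about the quota.
    if t.reset is None or elapsed >= t.reset:
        return 0.0            # window over (or unknown): assume quota restored
    if t.remaining is None or t.remaining > 0:
        return 0.0            # quota left in the current window
    return t.reset - elapsed  # no quota left: wait for the reset


# Retry-After (10s) later than the reset (5s): the client waits for Retry-After.
print(seconds_to_wait(Throttle(0.0, 10, 0, 5), now=3.0))     # 7.0
# No Retry-After, quota exhausted: the client waits for the reset.
print(seconds_to_wait(Throttle(0.0, None, 0, 58), now=8.0))  # 50.0
```

Note that step 2 treats the rate limiting headers as a hint only; a client remains free to retry earlier at the risk of being rejected.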
In this proposal, `Retry-After` is taken as something more than just a hint, meaning "don't perform a request because it won't be served", whereas `RateLimit-Reset` coupled with a `RateLimit-Remaining` of `0` would be a hint that a new request is very likely to be rejected. There is no necessary relation between both.
That said, if we wanted to raise the hint of the combination of a `RateLimit-Reset` value with a `RateLimit-Remaining` quota of `0`, then we could define the consistency as requiring that `Retry-After` should always be the same moment as `RateLimit-Reset` or later. In other words, any `Retry-After` in such a situation should never point to some time before `RateLimit-Reset`.
I don't think there is any other meaningful relation in terms of a quota higher than 0.
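A minimal sketch of that alternative definition, assuming both values have already been converted to delta-seconds relative to the same response. The function name is hypothetical, and nothing here is mandated by the draft.

```python
from typing import Optional


def retry_after_is_consistent(retry_after_s: Optional[int],
                              remaining: Optional[int],
                              reset_s: Optional[int]) -> bool:
    """Check the 'Retry-After is never earlier than RateLimit-Reset' relation.

    The relation is only meaningful when the quota is exhausted; in every
    other case (a field missing, or quota left) nothing is required.
    """
    if retry_after_s is None or remaining is None or reset_s is None:
        return True
    if remaining > 0:
        return True
    return retry_after_s >= reset_s


print(retry_after_is_consistent(58, 0, 58))   # True: same moment
print(retry_after_is_consistent(120, 0, 58))  # True: Retry-After is later
print(retry_after_is_consistent(10, 0, 58))   # False: earlier than the reset
```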
Even then, I think we can clearly separate `Retry-After` from the case of `RateLimit-Reset` + `RateLimit-Remaining: 0` by forcing agents to honor `Retry-After` and take the rate limit reset time as a strong hint that their requests won't be serviced until that time, yet they would be free to try again, as the headers just provide a (strong) hint rather than a fixed requirement.
Upgrading them from hints to requirements would then render a `Retry-After` earlier than the reset time with a 0 quota nonsensical.
A `Retry-After` with a point in time after the combination of `RateLimit-Reset` with `RateLimit-Remaining: 0` makes sense, because that would mean some of the issues `Retry-After` deals with (I want to think more in terms of infrastructure, scaling, service resources issues) are present and the rate limiting policy is only temporarily overridden, all the while providing some information in the (unmodified) headers that could be useful after the `Retry-After` time elapses.
Thoughts? @ioggstream
Hi @unleashed, agreed that "consistent" is not clearly defined. We should tackle it.
RateLimit-Reset and Retry-After
there is no need to define any consistency relation between both: just obey Retry-After.
Agreed: `RateLimit-Reset` is quota information, while `Retry-After` is a server statement.
My original idea of consistency was "same moment in time", but it's fine to KISS and remove the consistency concept (which is complex and probably does not add that much to the spec).
Dynamic limits
It isn't clear from this text whether that would cover adding more time to RateLimit-Reset or not
It's up to the server to decide how to modify headers and communicate limits. This spec should just define the header names and their semantics.
Dynamically lowering the values returned by the rate-limit headers, and returning retry-after along with them can improve availability.
This sentence should be fixed, as it mixes two use-cases:
- the first, where a server reduces limits or increases reset to tell the client to "slow down"
- the second where the server tells the client to "stop"
A corner case is when the window is very short and is continuously reset (eg. 1 second): in these cases a 429 seems to me more a slow-down than a stop :) but that's a corner case.
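To make the two use-cases concrete, here is a rough server-side sketch. The function names, the `factor` parameter and the choice of status codes (200 for "slow down", 429 for "stop") are assumptions for illustration; how a server actually adjusts its values is left to the server.

```python
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str]]


def slow_down(limit: int, remaining: int, reset_s: int,
              factor: float = 0.5) -> Response:
    """Keep serving the client, but advertise a reduced remaining quota."""
    reduced = int(remaining * factor)
    return 200, {
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": str(reduced),
        "RateLimit-Reset": str(reset_s),
    }


def stop(limit: int, reset_s: int, retry_after_s: int) -> Response:
    """Push the client back: reject the request and state when to retry."""
    return 429, {
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": "0",
        "RateLimit-Reset": str(reset_s),
        "Retry-After": str(retry_after_s),
    }


print(slow_down(limit=100, remaining=40, reset_s=30))
print(stop(limit=1, reset_s=58, retry_after_s=58))
```

The second call reproduces the header values of the 429 example at the top of this issue.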
PS: did you register to IETF106? https://www.ietf.org/registration/ietf106/remotereg.py It would be great if you join the meeting for the RateLimit presentation!
It's up to the server to decide how to modify headers and communicate limits. This spec should just define the header names and their semantics.
Right. My suggestion in this context is hinting at the use of `Retry-After` for transient conditions that are likely to be short-lived, such as DoS and similar, rather than using the rate-limiting headers for that purpose, in order to avoid redundancy and provide useful information usable after the condition disappears.
This sentence should be fixed, as it mixes two use-cases:
Ok, let me think about some rewording along the lines of the suggestions above and I'll submit a PR and reference this issue.
PS: did you register to IETF106? https://www.ietf.org/registration/ietf106/remotereg.py It would be great if you join the meeting for the RateLimit presentation!
Thanks for the link, I wasn't quite aware of the possibility of attending remotely. Just registered!
hinting at the use of Retry-After for [..] short-lived conditions such as DoS and similar, rather than using the rate-limiting headers
imho that depends on whether:
- the server wants to slow-down the client advertising a reduced quota
- the server wants to push-back the client
We can provide some example requests/responses and ask the HTTP community for feedback.
About IETF106 it would be great if we prepare the 10 minute talk together so that if I cannot attend for whatever reason you can take my place :)
About IETF106 it would be great if we prepare the 10 minute talk together so that if I cannot attend for whatever reason you can take my place :)
Sure, will ping you over IM.