RevisionSpec.Timeout has deviated from the original specification
/area API
I dropped the ball regarding how we handled RevisionSpec.Timeout in the OSS implementation. We deviated from what was originally in the spec and opt'd to change the spec (https://github.com/knative/serving/pull/10806) rather than fix the behaviour.
We prioritized backwards compatibility over remaining conformant.
In hindsight this was a mistake since:
- This led us to introduced a new field to cover the functional gap - (#10851)
TimeoutandMaxDurationbeing so similar is confusing and bad UX for end-users- Workloads running on the OSS implementation will behave differently on other compliant Knative offerings.
New plan is here: https://github.com/knative/serving/issues/12634#issuecomment-1137765904
~I think the best path forward here is:~
- [x] ~https://github.com/knative/serving/pull/12635~
- ~Yes this is poor form but it's only been out for two weeks in the latest release at the time of writing and hasn't been documented~
- [ ] ~create a feature flag that enables the first byte timeout logic (on by default)~
- [ ] ~notify users about this default and be explicit they should set it for compatibility~
- [ ] ~after N releases we swap the default to be conformant to the spec's original intention~
- [ ] ~For those who opt into the first-byte timeout we provide a maxDuration annotation so they still have that ability~
- [ ] ~Allow a per-revision override via annotations so existing revisions can work independently of the operator settings~
/assign @dprotaso
It feels like "time to first response byte" is a valid but niche use case. Rather than introducing a flag to change the meaning of timeoutSeconds, have we considered introducing a firstByteMaxTime or the like?
Thinking about it, it seems like the main use case for fBMT is for server or bidirectional streaming scenarios, which are definitely not the default.
(I don't like firstByteMaxTime as a name, for the record.)
It still seems useful to be able to specify a first but and max stream duration at the Revision level, to enable safe lame-duck operation and bound times for proxy-level drain and restart without complicated TCP handoffs.
Rather than introducing a flag to change the meaning of timeoutSeconds, have we considered introducing a firstByteMaxTime or the like?
I had not. Though now I wonder if adding support for idleTimeout (https://github.com/knative/serving/issues/10852) would cover the niche case of first byte timeout
Timeout and MaxDuration being so similar is confusing and bad UX for end-users
It is confusing inded. IMHO MaxDuration is a self-explanatory name and seems better that Timeout to me.
Should we capture this in a design doc and describe how previous cases can be handled with the "new" design? It would help also documenting some of the scenarios eg. "if I use websockets then..." (diagrams would be nice too to explain options).
create a feature flag that enables the first byte timeout logic (on by default)
@dprotaso enable it to be taken into account by the timeout handler? what happens if its disabled, the behaviour uses the MaxDuration only and ignore first byte response?
I'm also wondering about the plan -- do we mean a feature flag that makes timeoutSeconds mean timeoutOnResponseStart, or a feature-flag for a new parameter (idleTimeoutSeconds or startResponseTimeout)?
so here is my proposal:
Based on the user choice, the timeout handler would use the FirstByte or MaxDuration.
and I think since the MaxDuration seems to be the more common use case, we can make it the default, but at the same time allow the user to choose FirstByte if needed.
This new behaviour can start behind a feature gate for a release or two, then be on by default.
We can allow the user to choose using an annotation on the revision?
Circling back - our timeout to first byte logic is currently broken for web sockets. When using web sockets the connection will remain open for Timeout duration - which conforms to what the spec originally had.
user reported issue: https://knative.slack.com/archives/CA9RHBGJX/p1645566907713149
This simplifies the original plan since the earlier concern we had was about breaking compatibility with long running web sockets connections isn't an issue.
Thus all I think we need to do is:
- [ ] https://github.com/knative/specs/pull/100
- [ ] 1.5 release - Mention the upcoming change in the 1.5 release notes
- [ ] 1.6 release - Fix our timeout handlers to max duration instead of time to first byte.
- [ ] 1.6 release - Update our conformance tests to assert max duration behaviour of
Timeout - [ ] 1.6 release - Add an e2e test to handle timeouts for different protocols
but at the same time allow the user to choose FirstByte if needed.
I think this would be covered as a new property - IdleTimeout - https://github.com/knative/serving/issues/10852
Nader is going to drive this for the 1.6 release
/assign @nader-ziada /unassign @dprotaso
With https://github.com/knative/serving/issues/11980 we should now be able to make the necessary changes for 1.7