Queue Proxy health checks incompatible with non-HTTP/2 applications
/area networking
What version of Knative?
1.15.0
Expected Behavior
Legacy applications may have undefined behavior when HTTP/2 upgrade requests are made. Knative should gracefully handle those errors and downgrade the health check attempt to HTTP/1 or HTTP/1.1.
Actual Behavior
Applications which do not support HTTP/2 will not handle the upgrade request properly. In our case, a legacy application returns a 500 when OPTIONS are sent to upgrade the connection. Knative fails the entire healthcheck because of this, even if the same check over HTTP/1 or HTTP/1.1 will properly return a 200.
Steps to Reproduce the Problem
- Create an application which does not support HTTP/2 or returns a 500 on the OPTIONS request
- Notice that Knative will start failing the health checks and the pod will be killed
Additional Context
It is not within the Kubernetes spec that an application must support HTTP/2 or that it should expect an OPTIONS call to its health/liveness probes. Only GET is part of the contract, which the Queue Proxy does not follow.
I believe the logic is flawed in the queue proxy's HTTP probes here. https://github.com/knative/serving/blob/873602a410ce54db05d9fb6caab121e5824dbe41/pkg/queue/health/probe.go#L155
When an error occurs during the upgrade, maxProto should be set to 1 and Knative should stop trying to make HTTP/2 requests. Currently because of this line, HTTP/2 will be retried indefinitely and HTTP/1 will never be attempted.
I'm confused what's making HTTP2 requests? Knative healthchecks are HTTP/1
@dprotaso I can see requests being made from the queue-proxy to the user-container and attempting to upgrade to HTTP/2 during the readiness probes.
And the code I linked above I believe is the logic for the queue-proxy to perform the HTTP/2 upgrade for these probes. This happens when the feature gate for auto-detecting HTTP2 is set to true
oh interesting - i didn't realize this was added. h2c upgrade is deprecated https://datatracker.ietf.org/doc/html/rfc9113#section-3.1
We should probably just always be doing HTTP/1 unless the user has specified h2c OR we change the detection to use h2c prior knowledge
You don't have an example app where this breaks?
I agree that probes should have always been HTTP/1 to match what would be expected from Kubernetes. But if you want this to remain so you can tell if an app supports HTTP/2 or not, then I would suggest at least gracefully failing if the HTTP/2 check fails (fallback to HTTP/1).
Unfortunately I don't have a sample that I could share, but I think it should be reproducible if you just had an app that throws a 500 whenever an OPTIONS request is made (ie, the upgrade request)
Hi @braunsonm, thanks for reporting this.
This happens when the feature gate for auto-detecting HTTP2 is set to true
Would it work if you turn this off for now or is this something that fails in other scenarios?
Would it work if you turn this off for now or is this something that fails in other scenarios?
It does work if it is set to false, but that does mean other applications deployed on Knative can no longer benefit from HTTP/2 which is unfortunate.
but that does mean other applications deployed on Knative can no longer benefit from HTTP/2 which is unfortunate.
That autodetect feature was never completed. So if the app is using http2 you mean that QP is not going to use it with autodetect= off? What do you mean apps on Knative cant benefit from HTTP/2, could you elaborate?
What do you mean apps on Knative cant benefit from HTTP/2, could you elaborate?
I was under the impression that autodetecting HTTP2 feature was required for HTTP2 to be used between the activator and ksvc's. Is that not true?
This is has to do with probes here. We do support http2 without setting that auto-detect property which btw is not done as a feature (check our grpc tests for example). Also see here on what happens when you turn that on: https://github.com/knative/serving/blob/main/pkg/queue/readiness/probe.go#L233-L242. We only try the upgrade if maxProto = 0 see https://github.com/knative/serving/blob/main/pkg/queue/health/probe.go#L163 cc @dprotaso if has more to add for the background info of this feature
Right now to support HTTP2 requires people to set the containerPort name to be h2c.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: grpc-ping
namespace: default
spec:
template:
spec:
containers:
- image: docker.io/{username}/grpc-ping-go
ports:
- name: h2c
containerPort: 8080
The feature has an issue here https://github.com/knative/serving/issues/4283 - the idea is to detect the protocol without the labelling
I see. We use func which doesn't support naming the port so that's why the autodetection was going to be required for us.
@braunsonm is this something functions could help with instead? Do you mind opening an issue there too?
No it is not. @skonto this is broken in functions because of the flawed implementation in serving.
This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.
/lifecycle frozen
seems like there are some golang changes in 1.24 (due out feb) that might make this simpler
https://tip.golang.org/doc/go1.24#nethttppkgnethttp
I am looking into this and it seems that there isn't a simple fix as the connection upgrade mechanism from http to h2c was deprecated in RFC 9113 released in June 2022 (also mentioned by Dave above):
The "h2c" string was previously used as a token for use in the HTTP Upgrade mechanism's Upgrade header field (Section 7.8 of [HTTP]). This usage was never widely deployed and is deprecated by this document. The same applies to the HTTP2-Settings header field, which was used with the upgrade to "h2c".
This issue is basically caused by relying on this behavior here: https://github.com/knative/serving/blob/a2a2441bb8488b6059fd0d4af3f290ea979e67d4/pkg/queue/health/probe.go#L136-L140
We also use the h2c package for which this deprecation proposal exists partly for the same reasons: https://github.com/golang/go/issues/72039
IMO it doesn't make sense to workaround the legacy connection upgrade behavior, I don't know the history of how we support http2 in serving yet but would naively propose to:
- remove the deprecated connection upgrade logic in our probe
- implement a different way of detecting if the endpoint supports http2 and/or http1
- replace the
h2c/http2package used in our tests with the go standard library equivalents
Let me know in case somebody has better/different idea on how we want handle this.
Related to:
- https://github.com/knative/serving/issues/10962
- https://github.com/knative/serving/issues/4283
/assign
From the go issue with 1.24 we now have
Go 1.24 includes support for unencrypted HTTP/2 client and server connections. Setting Server.Protocols to a value which includes the UnencryptedHTTP2 protocol enables support for serving unencrypted HTTP/2 connections. Setting Transport.Protocols to a value which includes UnencryptedHTTP2 and not HTTP1 causes the client to use unencrypted HTTP/2 for requests for http:// URLs.