serving icon indicating copy to clipboard operation
serving copied to clipboard

Queue Proxy health checks incompatible with non-HTTP/2 applications

Open braunsonm opened this issue 1 year ago • 18 comments

/area networking

What version of Knative?

1.15.0

Expected Behavior

Legacy applications may have undefined behavior when HTTP/2 upgrade requests are made. Knative should gracefully handle those errors and downgrade the health check attempt to HTTP/1 or HTTP/1.1.

Actual Behavior

Applications which do not support HTTP/2 will not handle the upgrade request properly. In our case, a legacy application returns a 500 when OPTIONS are sent to upgrade the connection. Knative fails the entire healthcheck because of this, even if the same check over HTTP/1 or HTTP/1.1 will properly return a 200.

Steps to Reproduce the Problem

  1. Create an application which does not support HTTP/2 or returns a 500 on the OPTIONS request
  2. Notice that Knative will start failing the health checks and the pod will be killed

Additional Context

It is not within the Kubernetes spec that an application must support HTTP/2 or that it should expect an OPTIONS call to its health/liveness probes. Only GET is part of the contract, which the Queue Proxy does not follow.

I believe the logic is flawed in the queue proxy's HTTP probes here. https://github.com/knative/serving/blob/873602a410ce54db05d9fb6caab121e5824dbe41/pkg/queue/health/probe.go#L155

When an error occurs during the upgrade, maxProto should be set to 1 and Knative should stop trying to make HTTP/2 requests. Currently because of this line, HTTP/2 will be retried indefinitely and HTTP/1 will never be attempted.

braunsonm avatar Jul 31 '24 13:07 braunsonm

I'm confused what's making HTTP2 requests? Knative healthchecks are HTTP/1

dprotaso avatar Jul 31 '24 20:07 dprotaso

@dprotaso I can see requests being made from the queue-proxy to the user-container and attempting to upgrade to HTTP/2 during the readiness probes.

And the code I linked above I believe is the logic for the queue-proxy to perform the HTTP/2 upgrade for these probes. This happens when the feature gate for auto-detecting HTTP2 is set to true

braunsonm avatar Jul 31 '24 20:07 braunsonm

oh interesting - i didn't realize this was added. h2c upgrade is deprecated https://datatracker.ietf.org/doc/html/rfc9113#section-3.1

We should probably just always be doing HTTP/1 unless the user has specified h2c OR we change the detection to use h2c prior knowledge

dprotaso avatar Jul 31 '24 20:07 dprotaso

You don't have an example app where this breaks?

dprotaso avatar Jul 31 '24 20:07 dprotaso

I agree that probes should have always been HTTP/1 to match what would be expected from Kubernetes. But if you want this to remain so you can tell if an app supports HTTP/2 or not, then I would suggest at least gracefully failing if the HTTP/2 check fails (fallback to HTTP/1).

Unfortunately I don't have a sample that I could share, but I think it should be reproducible if you just had an app that throws a 500 whenever an OPTIONS request is made (ie, the upgrade request)

braunsonm avatar Jul 31 '24 20:07 braunsonm

Hi @braunsonm, thanks for reporting this.

This happens when the feature gate for auto-detecting HTTP2 is set to true

Would it work if you turn this off for now or is this something that fails in other scenarios?

skonto avatar Aug 01 '24 13:08 skonto

Would it work if you turn this off for now or is this something that fails in other scenarios?

It does work if it is set to false, but that does mean other applications deployed on Knative can no longer benefit from HTTP/2 which is unfortunate.

braunsonm avatar Aug 01 '24 14:08 braunsonm

but that does mean other applications deployed on Knative can no longer benefit from HTTP/2 which is unfortunate.

That autodetect feature was never completed. So if the app is using http2 you mean that QP is not going to use it with autodetect= off? What do you mean apps on Knative cant benefit from HTTP/2, could you elaborate?

skonto avatar Aug 01 '24 14:08 skonto

What do you mean apps on Knative cant benefit from HTTP/2, could you elaborate?

I was under the impression that autodetecting HTTP2 feature was required for HTTP2 to be used between the activator and ksvc's. Is that not true?

braunsonm avatar Aug 01 '24 14:08 braunsonm

This is has to do with probes here. We do support http2 without setting that auto-detect property which btw is not done as a feature (check our grpc tests for example). Also see here on what happens when you turn that on: https://github.com/knative/serving/blob/main/pkg/queue/readiness/probe.go#L233-L242. We only try the upgrade if maxProto = 0 see https://github.com/knative/serving/blob/main/pkg/queue/health/probe.go#L163 cc @dprotaso if has more to add for the background info of this feature

skonto avatar Aug 01 '24 14:08 skonto

Right now to support HTTP2 requires people to set the containerPort name to be h2c.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: grpc-ping
  namespace: default
spec:
  template:
    spec:
      containers:
      - image: docker.io/{username}/grpc-ping-go
        ports:
          - name: h2c
            containerPort: 8080

The feature has an issue here https://github.com/knative/serving/issues/4283 - the idea is to detect the protocol without the labelling

dprotaso avatar Aug 01 '24 16:08 dprotaso

I see. We use func which doesn't support naming the port so that's why the autodetection was going to be required for us.

braunsonm avatar Aug 01 '24 16:08 braunsonm

@braunsonm is this something functions could help with instead? Do you mind opening an issue there too?

skonto avatar Oct 24 '24 09:10 skonto

No it is not. @skonto this is broken in functions because of the flawed implementation in serving.

braunsonm avatar Oct 24 '24 17:10 braunsonm

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Jan 23 '25 01:01 github-actions[bot]

/lifecycle frozen

seems like there are some golang changes in 1.24 (due out feb) that might make this simpler

https://tip.golang.org/doc/go1.24#nethttppkgnethttp

dprotaso avatar Jan 23 '25 21:01 dprotaso

I am looking into this and it seems that there isn't a simple fix as the connection upgrade mechanism from http to h2c was deprecated in RFC 9113 released in June 2022 (also mentioned by Dave above):

The "h2c" string was previously used as a token for use in the HTTP Upgrade mechanism's Upgrade header field (Section 7.8 of [HTTP]). This usage was never widely deployed and is deprecated by this document. The same applies to the HTTP2-Settings header field, which was used with the upgrade to "h2c".

This issue is basically caused by relying on this behavior here: https://github.com/knative/serving/blob/a2a2441bb8488b6059fd0d4af3f290ea979e67d4/pkg/queue/health/probe.go#L136-L140

We also use the h2c package for which this deprecation proposal exists partly for the same reasons: https://github.com/golang/go/issues/72039

IMO it doesn't make sense to workaround the legacy connection upgrade behavior, I don't know the history of how we support http2 in serving yet but would naively propose to:

  1. remove the deprecated connection upgrade logic in our probe
  2. implement a different way of detecting if the endpoint supports http2 and/or http1
  3. replace the h2c/http2 package used in our tests with the go standard library equivalents

Let me know in case somebody has better/different idea on how we want handle this.

Related to:

  • https://github.com/knative/serving/issues/10962
  • https://github.com/knative/serving/issues/4283

/assign

linkvt avatar Oct 27 '25 13:10 linkvt

From the go issue with 1.24 we now have

Go 1.24 includes support for unencrypted HTTP/2 client and server connections. Setting Server.Protocols to a value which includes the UnencryptedHTTP2 protocol enables support for serving unencrypted HTTP/2 connections. Setting Transport.Protocols to a value which includes UnencryptedHTTP2 and not HTTP1 causes the client to use unencrypted HTTP/2 for requests for http:// URLs.

dprotaso avatar Oct 28 '25 14:10 dprotaso