gateway-api icon indicating copy to clipboard operation
gateway-api copied to clipboard

Conformance - Support for TLS-Encrypted Test Cases in GRPCRoute

Open tao12345666333 opened this issue 1 year ago • 8 comments

What happened:

As we integrate and evolve our implementation to align with the Gateway API's standards, we have encountered a notable scenario that I believe warrants consideration for the broader community and the future direction of GRPCRoute conformance tests.

Background: In the course of updating our implementation KIC to support the latest iterations of Gateway API, specifically around GRPCRoute, we have observed that the current conformance tests predominantly anticipate unencrypted (plaintext) gRPC traffic.

This observation is based on the default behavior in our own, where the default protocol for GRPCRoute has been set to grpcs to accommodate secure communication practices, which is a deviation from what the current conformance tests seem to expect.

Reference to our implementation details can be found here: KIC GRPCRoute Translation

What you expected to happen:

Given the increasing adoption of TLS/SSL to secure gRPC traffic in production environments, we propose the inclusion of TLS-encrypted test cases within the GRPCRoute conformance testing suite. This addition would not only reflect the real-world usage scenarios more accurately but also encourage implementations to support secure gRPC communication.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

  • The gRPC client used in conformance test https://github.com/kubernetes-sigs/gateway-api/blob/400e36da6929b98674e1d71bdd7eb65ca72e7438/conformance/utils/grpc/grpc.go#L144-L160
  • The GRPCRoute GEP-1016 methioned that the H2C (gRPC without TLS Encrypted) is more for testing. The protocol that is not mandatory to use in the standard is grpc or grpcs.
  • I noticed that some implementations (e.g. cilium, envoy-gateway) have passed the GRPCRoute conformance test, which is because they default to support H2C.

tao12345666333 avatar Apr 11 '24 19:04 tao12345666333

/area conformance

tao12345666333 avatar Apr 11 '24 19:04 tao12345666333

we propose the inclusion of TLS-encrypted test cases within the GRPCRoute conformance testing suite

For starters, I 100% agree with this.

where the default protocol for GRPCRoute

I don't think I'm fully following here. Where is a default coming into play? The choice of H2C vs HTTP/2 with TLS should be based on the protocol field of the listener on the Gateway. (reference in the GEP). AFAICT, the current set of conformance tests only uses Gateways that explicitly set the protocol type to HTTP and so the connection should be H2C.

Arguably, based on the GEP, conformant implementations need only support one of HTTP and HTTPS. When a listener with an unsupported type is encountered, the implementation should raise a "Detached" condition for the affected listener with a reason of "UnsupportedProtocol". This should maybe be reflected by two different features: one for H2C and one for HTTP/2 with TLS?

@robscott @youngnick Thoughts on having two separate features for this?

Can you help me understand what you mean by "default"? This is a default value for what option within the Gateway API?

gnossen avatar Apr 11 '24 21:04 gnossen

@robscott @youngnick Thoughts on having two separate features for this?

I'd rather avoid splitting this into separate features unless we run into implementations that can't support one or the other. With that said, if an outcome of this is that KIC can only support h2 and not h2c, I'd be open to a split.

robscott avatar Apr 11 '24 21:04 robscott

Can you help me understand what you mean by "default"? This is a default value for what option within the Gateway API?

Sorry, I may not be clear enough. What I mean is that KIC (kong ingress controller) sets the proxy protocol to grpcs by default.

This contains some background, in fact all NGINX-based implementations are affected by this.

Prior to NGINX 1.25.1, it was not possible to have h2c and http/1.1 listen on the same port. This means that if an NGINX instance that wants to proxy grpc (h2c) on port 80, then it cannot server http/1.1 on port 80.

So in this case, it can usually only provide grpcs and https proxies on port 443 at the same time.

This is why KIC chose to hardcode it to grpcs when setting up the GRPCRoute proxy.


Since Kong Gateway 3.6(the latest version), it based on NGINX v1.25.3.1, that means it can server grpc(h2c) and http/1.1 on the same port 80.

KIC can make changes, but it is equivalent to changing the default behavior of KIC, so this is the part we are worried about.

But this is not the main part of my discussion in this issue.


When we do conformance testing, our goal is to confirm whether the traffic can reach the target service as expected.

So whether it is through h2c protocol proxy or grpcs protocol proxy is optional. These implementations only need to implement one or both of them, and they should be considered to pass the conformance test.

The GWAPI conformance test only has h2c test cases, but no grpcs test cases. I think this needs to be added.

tao12345666333 avatar Apr 12 '24 02:04 tao12345666333

I'd rather avoid splitting this into separate features unless we run into implementations that can't support one or the other. With that said, if an outcome of this is that KIC can only support h2 and not h2c, I'd be open to a split.

Thank you @robscott

As I said above, KIC can only provide both h2 and h2c proxy support when used with Kong Gateway version 3.6 or above. (Without making any other modifications)

In addition, we also need to modify the default behavior of KIC.

tao12345666333 avatar Apr 12 '24 03:04 tao12345666333

@tao12345666333 I think https://github.com/kubernetes-sigs/gateway-api/pull/2968 should resolve your immediate issue as that update means that the only correct way to implement these tests would be with h2c. I think you can safely override your default behavior when appProtocol is set to clearly request h2c.

robscott avatar Apr 12 '24 17:04 robscott

YES, SGTM.

I will make changes in KIC to support h2c. Do we want to add TLS-encrypted test case for GRPCRoute?

tao12345666333 avatar Apr 13 '24 21:04 tao12345666333

Do we want to add TLS-encrypted test case for GRPCRoute?

Yes. These additional test cases should be coming soon.

gnossen avatar Apr 13 '24 21:04 gnossen

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jul 12 '24 22:07 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Aug 11 '24 22:08 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Sep 10 '24 23:09 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Sep 10 '24 23:09 k8s-ci-robot