gateway-api

GEP: Describe Backend Properties

Open youngnick opened this issue 3 years ago • 7 comments

What would you like to be added:

As people have started implementing Gateway API, some functionality has come up that:

  • is tightly bound to the backend (which is usually a Service)
  • can't easily be added to the Service resource

but that we still need a way to represent in the API.

The first examples that came up are:

  • TLS details for (re)encryption between the Gateway and the backend. That is, the backend must have a way to store a serving TLS keypair that represents its service identity. (This one in particular has a lot of crossover with Mesh/GAMMA use cases.)
  • HTTP Websocket protocol needs to be able to be turned on somewhere in the API, but requires the backend implementing it to support it.

These are certainly not the only things that we'll need to add to this sort of construct though, so it must be extensible and designed with the future in mind.

This GEP covers the work to design and implement solutions to these problems.

As @shaneutt suggested, the first step will be implementing a provisional GEP that sets out the terms of what we're doing - the "What" and "Why" of the solution, and then once we're on the same page, we'll talk more about the "How".

Part of this GEP process should also be clarifying if we're using Policy Attachment for this, and why we've chosen to use it or not.

Why this is needed:

Layer 7 implementations already support both of these first examples; solving them in a more general way will give us a path towards adding more things.

youngnick avatar Jul 26 '22 00:07 youngnick

/kind gep

youngnick avatar Jul 26 '22 00:07 youngnick

Okay, I've started something rough in https://docs.google.com/document/d/1M8EPrZKuhYsQjHnVrdsBxU8mVPKUWUYLXW9qE_Et9-Y/edit?usp=sharing .

Remember that we're working first on a "what" and a "why", not on a "how" yet. Let's do a Provisional GEP agreeing on those things, then we can move to the "how".

youngnick avatar Jul 26 '22 07:07 youngnick

That is, the backend must have a way to store a serving TLS keypair that represents its service identity. (This one in particular has a lot of crossover with Mesh/GAMMA use cases.)

One interesting twist here is that for mesh cases it may be preferable for the TLS identity to be assigned by a mesh controller rather than specified manually (but maybe possible to override? manual specification likely needed for non-mesh cases too) which could suggest a status field for exposing this to consumers.

mikemorris avatar Jul 26 '22 18:07 mikemorris

Would it be reasonable for the mesh to fill in the TLS details in the ServiceMetadata? Perhaps a spec/status split where the mesh could fill in the status if the cert was not specified in spec.
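To make the spec/status idea concrete, here is a rough sketch of how a consumer might resolve the TLS details it should use. Everything in it is an assumption for illustration only: `TLSDetails`, its `ca_bundle` and `hostname` fields, the `spec_tls`/`status_tls` names, and `effective_tls` are hypothetical, and no schema for ServiceMetadata has been agreed in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TLSDetails:
    # Hypothetical fields; the thread does not define a schema.
    ca_bundle: str
    hostname: str


@dataclass
class ServiceMetadata:
    # spec: TLS details a user requested explicitly (may be absent).
    spec_tls: Optional[TLSDetails] = None
    # status: TLS details generated by a mesh controller (may be absent).
    status_tls: Optional[TLSDetails] = None


def effective_tls(meta: ServiceMetadata) -> Optional[TLSDetails]:
    """Prefer user-specified details; fall back to mesh-assigned ones."""
    if meta.spec_tls is not None:
        return meta.spec_tls
    return meta.status_tls
```

With this shape, a manual override in spec always wins, and non-mesh setups simply never populate the status side.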

evankanderson avatar Jul 27 '22 00:07 evankanderson

I mentioned this elsewhere but IMO application TLS and mesh encryption (which is not always tls...) should be kept entirely separated

howardjohn avatar Jul 27 '22 00:07 howardjohn

I think that having a space in the status of resources to put generated TLS details makes some sense, but I think that we first need to resolve the confusion about what we're putting in spec, for the case where you do want to request something. Then, any status info can be added to that same resource.

youngnick avatar Aug 01 '22 10:08 youngnick

As suggested by @mikemorris in Slack, it seems useful to describe why we ended up removing BackendPolicy earlier. Unfortunately we didn't leave much of a paper trail, but here's what I remember:

  • We didn't feel confident in this API
  • It had not been implemented by anyone
  • Many of the concepts we'd tried to add (health checks, timeouts, etc) were not as portable as we hoped and couldn't fit in this resource
  • We were trying to release v1alpha2 and this specific resource didn't seem to make the cut for that new API version
  • Policy attachment seemed like a promising pattern for this type of configuration, and this resource both did not use the policy attachment model and had naming that made it look like it should

As always, feel free to correct me if I got anything wrong or missed anything.

robscott avatar Aug 19 '22 00:08 robscott

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 17 '22 00:11 k8s-triage-robot

Hey, did GEP-1282 get checked in? Does it address the needs described in https://github.com/kubernetes-sigs/gateway-api/discussions/1244, particularly the ability to distinguish between upstreams that support HTTP/2 and / or websockets, vs those which do not?

evankanderson avatar Nov 30 '22 16:11 evankanderson

/remove-lifecycle stale

evankanderson avatar Nov 30 '22 16:11 evankanderson

My takeaway from the Nov 14 community meeting (notes, video) is that the GEP is going through a redux with a reduced scope of dealing with backend TLS.

Use cases are being gathered here - https://docs.google.com/document/d/17sctu2uMJtHmJTGtBi_awGB0YzoCLodtR6rUNmKMCs8/edit

This doesn't address the needs described in https://github.com/kubernetes-sigs/gateway-api/discussions/1244 and no one is driving that at the moment.

dprotaso avatar Nov 30 '22 16:11 dprotaso

Thanks @dprotaso, that's correct.

youngnick avatar Dec 01 '22 00:12 youngnick

/lifecycle stale

k8s-triage-robot avatar Mar 01 '23 00:03 k8s-triage-robot

/remove-lifecycle stale

I still care about this! As a refresher, for Knative, we'd like to be able to specify the following upstream server behavior using a consistent (Gateway-API conformance tested) interface:

  • Route HTTP requests to this upstream, expect the upstream to speak TLS using the following CA and SAN
    • This seems to be supported on (at least) Contour, Istio, Nginx and Traefik, though it's specified in different ways for each -- sometimes in-line, sometimes as a separate object, sometimes as a wrapper around a Kubernetes Service.
  • Specify the flavors of HTTP request (minimum: http/2, websocket, http/2+websocket) that the upstream can support. This could simply be pass-through of the protocols, or it could be conversion of e.g. HTTP/3 to HTTP/1.1 or HTTP/2.
    • For Knative Serving, we could probably live with a conformance rule that said "must be able to pass through http/2+websocket" and then handle the conversion in Go code in our infrastructure before the user container.
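As a rough illustration of the second requirement, the pass-through versus conversion decision could look something like the sketch below. The `Protocol` enum, the `CONVERSION_PREFERENCE` order, and `select_upstream_protocol` are all hypothetical names, and the logic is deliberately simplified: it treats websocket as something that is only ever passed through when the upstream declares support, and it ignores details such as websocket over HTTP/2.

```python
from enum import Enum
from typing import FrozenSet, Optional


class Protocol(Enum):
    HTTP1 = "http/1.1"
    HTTP2 = "http/2"
    HTTP3 = "http/3"
    WEBSOCKET = "websocket"


# Assumed fallback order when the client's protocol is not supported upstream.
CONVERSION_PREFERENCE = (Protocol.HTTP2, Protocol.HTTP1)


def select_upstream_protocol(
    client: Protocol, upstream_supported: FrozenSet[Protocol]
) -> Optional[Protocol]:
    """Return the protocol to speak to the upstream, or None if none fits."""
    if client in upstream_supported:
        return client  # pass-through, no conversion needed
    if client is Protocol.WEBSOCKET:
        return None  # assumption: websocket requires upstream support
    for candidate in CONVERSION_PREFERENCE:
        if candidate in upstream_supported:
            return candidate  # e.g. HTTP/3 converted to HTTP/2
    return None
```

Under the "pass-through only" conformance rule, a data plane would only ever need the first branch, with any conversion left to infrastructure in front of the user container.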

If those should be separate GEPs, that seems fine.

evankanderson avatar Mar 17 '23 17:03 evankanderson

yeah, I think that this one has not gone well because I leaned too deep into combining everything into a single type, and then it had a large impact on many things. I'm working with @candita at the moment on how we can make something smaller, that just covers managing a TLS connection between the data plane and the backend, using the new Direct Attached Policy type that is being introduced in #1565 (once I get that merged, anyway).

Once we've got something smaller done there, I think we can look at whether we continue adding smaller, tightly scoped Policy objects, or do something bigger with a more involved design.

youngnick avatar Mar 20 '23 04:03 youngnick

I'm splitting out the backend protocol selection from this GEP to https://github.com/kubernetes-sigs/gateway-api/issues/1911. TLS use cases have already been split out to https://github.com/kubernetes-sigs/gateway-api/issues/1897.

dprotaso avatar Apr 05 '23 13:04 dprotaso