Define how Gateways should or should not interact with GAMMA routing configuration

Open mikemorris opened this issue 3 years ago • 23 comments

What would you like to be added:

As a follow-up to #1426, there is a need to clarify how traffic ingressing through a Gateway should or should not respect GAMMA routing configuration. Specifically, a Service specified as a backendRef of an HTTPRoute with a Gateway parentRef may have separately configured mesh routing rules from another HTTPRoute that specifies the same Service as a parentRef.

An initial draft of a proposal to address this has been started in https://docs.google.com/document/d/16GZj-XFt6sAi4tMUy9Ckr99znIm6Hy0W0VeawJUdWRw/edit#

Why this is needed:

There are at least two possible approaches to handling this: expecting or allowing a Gateway to implicitly respect GAMMA routing rules (which may be difficult for Gateway API implementations focused on N/S use cases, or when mixing N/S and E/W implementations from different vendors), or requiring more explicit configuration. We should clarify the expected behavior here to facilitate GAMMA implementation.
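
To make the scenario concrete, here is a minimal sketch in Python; it is not taken from any Gateway API implementation. Routes are modeled as plain dictionaries that loosely mirror HTTPRoute `parentRefs` and `backendRefs` (flattened, whereas the real resource nests `backendRefs` under rules). The route names, the Service name `echo`, the Gateway name `edge-gateway`, and the helper `services_with_overlapping_config` are all hypothetical. The helper flags Services that are a backendRef of a Gateway-attached route while also being the parentRef of a separate mesh route, which is the case this issue asks to clarify.

```python
from typing import Dict, List, Set

# Hypothetical, simplified stand-ins for HTTPRoute objects; only the fields
# relevant to this issue are modeled.
ingress_route = {
    "name": "ingress-route",
    "parentRefs": [{"kind": "Gateway", "name": "edge-gateway"}],
    "backendRefs": [{"kind": "Service", "name": "echo"}],
}
mesh_route = {
    "name": "mesh-route",
    "parentRefs": [{"kind": "Service", "name": "echo"}],
    "backendRefs": [{"kind": "Service", "name": "echo-v2"}],
}


def services_with_overlapping_config(routes: List[Dict]) -> Set[str]:
    """Return Services that are a backend of a Gateway-attached route and are
    also the parentRef of a separate route with no Gateway parentRef."""
    gateway_backends: Set[str] = set()
    mesh_parents: Set[str] = set()
    for route in routes:
        parent_kinds = {ref["kind"] for ref in route["parentRefs"]}
        if "Gateway" in parent_kinds:
            # N/S route: remember which Services it forwards to.
            gateway_backends.update(
                ref["name"] for ref in route["backendRefs"] if ref["kind"] == "Service"
            )
        else:
            # Mesh-only route: remember which Services it configures.
            mesh_parents.update(
                ref["name"] for ref in route["parentRefs"] if ref["kind"] == "Service"
            )
    return gateway_backends & mesh_parents


overlap = services_with_overlapping_config([ingress_route, mesh_route])
print(sorted(overlap))  # prints ['echo']
```

Whether the Gateway should apply `mesh-route` when forwarding to `echo` is exactly the open question.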

mikemorris avatar Oct 26 '22 19:10 mikemorris

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 24 '23 21:01 k8s-triage-robot

/assign

mikemorris avatar Jan 31 '23 16:01 mikemorris

/assign @kflynn

mikemorris avatar Jan 31 '23 16:01 mikemorris

/remove-lifecycle stale

keithmattix avatar Feb 07 '23 21:02 keithmattix

/lifecycle stale

k8s-triage-robot avatar May 08 '23 21:05 k8s-triage-robot

/lifecycle rotten

k8s-triage-robot avatar Jun 29 '23 19:06 k8s-triage-robot

/close not-planned

k8s-triage-robot avatar Jan 19 '24 05:01 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jan 19 '24 05:01 k8s-ci-robot

/reopen /remove-lifecycle rotten /lifecycle staleproof

craigbox avatar Jul 25 '24 21:07 craigbox

@craigbox: Reopened this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Jul 25 '24 21:07 k8s-ci-robot

Hi @craigbox! We see you've re-opened this issue. Generally speaking on this project we ask that maintainers be involved/consulted in the decision to re-open closed issues, as we are the ones that have to prioritize and work on the logistics. That said, we can definitely make exceptions when needed! We appreciate your interest in the issue, and were wondering: out of curiosity, is this something that you're interested in personally contributing to in order to help move forward?

shaneutt avatar Aug 09 '24 12:08 shaneutt

I will have re-opened this because @mikemorris mentioned it here.

(Maintainers didn't close the issue, the passage of time did, and personally I have little love for that. The problem still exists, even if it's not currently being worked on or tracked.)

craigbox avatar Aug 12 '24 00:08 craigbox

We can understand how community members like yourself may be pained by seeing issues auto-close, especially if it's something you want to see implemented. It is, however, the case that the project has limited resources (effectively running on volunteer time) and we simply cannot prioritize and move forward with all issues. We know this can be frustrating, and we are sorry for that frustration, but "Closed" is sometimes the most honest and realistic reflection of the state of an issue in terms of priority and project management.

Ideally, we would ask that community members consider putting a closed issue on the meeting agenda or mailing list for discussion, or be personally willing to invest time in moving it forward, before bumping it back open. This tends to be a better way to breathe new life into an issue, share context, and perhaps garner support from the people who will work on it directly.

All the above said for the general case, I do think this is perhaps a bit of a special case, since there were two people assigned to the issue prior to its closing.

@mikemorris and @kflynn what are your thoughts on this issue?

/triage needs-information

shaneutt avatar Aug 12 '24 12:08 shaneutt

I think that this is still relevant, and that our new(ish)ly better-defined GEP process is the right way to tackle it, given that we know it's relevant now, that it's becoming more relevant with cloud gateways, and that there are some fascinating cans of worms lurking behind the innocuous title of this issue.

To that end, I'll organize some thoughts and open a discussion. Let's leave this open until that happens.

kflynn avatar Aug 17 '24 02:08 kflynn

/lifecycle stale

k8s-triage-robot avatar Nov 15 '24 03:11 k8s-triage-robot

/remove-lifecycle stale pending https://github.com/kubernetes-sigs/gateway-api/issues/1478#issuecomment-2294534288

craigbox avatar Nov 16 '24 02:11 craigbox

/lifecycle stale

k8s-triage-robot avatar Feb 14 '25 03:02 k8s-triage-robot

/remove-lifecycle stale

This is a thing I’d like to fit into the next release cycle. 🤞

kflynn avatar Feb 14 '25 03:02 kflynn

/triage accepted

shaneutt avatar Mar 25 '25 15:03 shaneutt

One thing that I ran into recently with this is that we currently have no conformance tests that check basic assumptions we've made about how parentRefs work, namely:

  • we don't have a conformance test that checks that Gateway implementations don't set status for Service parentRefs
  • we don't have a conformance test that checks that Mesh implementations don't set status for Gateway parentRefs
  • we don't have a more general conformance test that checks that implementations don't update status for parentRefs that don't roll up to a valid Gateway (or Service).

Cilium had bugs where we were doing all of these ☹ .

I have some rough designs for these based on tests I had to add to Cilium that I'll be looking to upstream post-Kubecon.

youngnick avatar Mar 26 '25 02:03 youngnick

we don't have a conformance test that checks that Mesh implementations don't set status for Gateway parentRefs

This isn't a blanket rule though, it would be "for Gateway parentRefs that roll up to a GatewayClass not associated with the mesh" - Istio definitely sets status on its own ingress gateways.

But yes agreed in principle, would be glad to review/collab post-KubeCon on getting these in.
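
The checks discussed above, including the GatewayClass refinement, could be sketched roughly as follows. This is an illustrative Python model, not the upstream conformance suite (which is written in Go) and not the Cilium tests mentioned earlier. The `parentRef` and `controllerName` keys mirror the fields of a route's `status.parents` entries; the function name, the `example.com/...` controller names, and the pre-resolved `gateway_controllers` lookup are assumptions made for the example.

```python
from typing import Dict, List, Set


def unexpected_status_entries(
    status_parents: List[Dict],
    controller_name: str,
    is_mesh: bool,
    gateway_controllers: Dict[str, str],
    known_services: Set[str],
) -> List[Dict]:
    """Return status entries written by controller_name that the checks
    discussed above would flag.

    gateway_controllers maps each existing Gateway name to the controllerName
    of the GatewayClass it rolls up to (a hypothetical, pre-resolved lookup).
    """
    flagged = []
    for entry in status_parents:
        if entry["controllerName"] != controller_name:
            continue  # written by another implementation
        ref = entry["parentRef"]
        if ref["kind"] == "Service":
            # Only a mesh should report on Service parentRefs, and only for
            # Services that actually exist.
            ok = is_mesh and ref["name"] in known_services
        elif ref["kind"] == "Gateway":
            # Allowed only when the Gateway exists and its GatewayClass is
            # handled by this same controller (e.g. a mesh's own ingress).
            ok = gateway_controllers.get(ref["name"]) == controller_name
        else:
            continue  # other parent kinds are not modeled in this sketch
        if not ok:
            flagged.append(entry)
    return flagged


status = [
    {"parentRef": {"kind": "Gateway", "name": "mesh-ingress"}, "controllerName": "example.com/mesh"},
    {"parentRef": {"kind": "Gateway", "name": "edge-gateway"}, "controllerName": "example.com/mesh"},
    {"parentRef": {"kind": "Service", "name": "echo"}, "controllerName": "example.com/mesh"},
]
gateways = {"mesh-ingress": "example.com/mesh", "edge-gateway": "example.com/edge"}
flagged = unexpected_status_entries(status, "example.com/mesh", True, gateways, {"echo"})
print([e["parentRef"]["name"] for e in flagged])  # prints ['edge-gateway']
```

In this model the mesh may report on its own ingress Gateway and on the Service, but not on a Gateway that belongs to a different controller.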

mikemorris avatar Mar 28 '25 21:03 mikemorris

/lifecycle stale

k8s-triage-robot avatar Jun 26 '25 22:06 k8s-triage-robot

/remove-lifecycle stale

craigbox avatar Jun 27 '25 00:06 craigbox

/lifecycle stale

k8s-triage-robot avatar Sep 25 '25 00:09 k8s-triage-robot

There are likely still some additions to conformance tests desired (as mentioned in https://github.com/kubernetes-sigs/gateway-api/issues/1478#issuecomment-2753089628, which probably belongs under the scope of https://github.com/kubernetes-sigs/gateway-api/issues/3566), but some aspects of this issue have been clarified substantially since this issue was created:

  • Many N/S Gateway API implementations want to route directly to endpoints for "sticky session" functionality in ways that would bypass Service-oriented mesh configuration.
  • A mesh may be able to enforce "inbound" behaviors like AuthZ and rate limiting, but programming "outbound" behaviors for a N/S Gateway can be more complicated, as described in https://gateway-api.sigs.k8s.io/geps/gep-3792/#4-the-outbound-behavior-problem. For in-cluster implementations this can sometimes be solved either by sticking a sidecar proxy next to the ingress to route all ingress traffic through the mesh, or by having a fully-integrated ingress capability in a mesh product and programming the N/S Gateway based on awareness of mesh configuration. However, neither of these options is typically available for integrating off-cluster gateways into a mesh.

Because of these issues (and the potential increased burden to N/S Gateway API implementations of expecting awareness of mesh configuration, rather than configuring something like mesh mTLS integration as an additive feature), I think we've mostly landed on preferring explicit configuration (e.g. HTTPRoutes with both a Service and Gateway parentRef for simple cases), and expecting that Gateway routing decisions should not need to parse or follow configuration in routes not attached to the Gateway, such as E/W rules with only a Service parentRef specified.
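
As a rough illustration of that explicit-configuration model, the Python sketch below selects only the routes a Gateway would need to consider. The dictionaries are simplified stand-ins for HTTPRoutes, and the route, Gateway, and Service names and the helper `routes_for_gateway` are hypothetical. A route with both a Gateway and a Service parentRef is picked up, while an E/W route with only a Service parentRef is ignored.

```python
from typing import Dict, List


def routes_for_gateway(routes: List[Dict], gateway_name: str) -> List[str]:
    """Return the names of routes explicitly attached to the given Gateway.

    Routes that only have a Service parentRef (E/W rules) are never
    consulted, matching the explicit-configuration preference above.
    """
    return [
        route["name"]
        for route in routes
        if any(
            ref["kind"] == "Gateway" and ref["name"] == gateway_name
            for ref in route["parentRefs"]
        )
    ]


routes = [
    {
        "name": "shared-route",
        "parentRefs": [
            {"kind": "Gateway", "name": "edge-gateway"},
            {"kind": "Service", "name": "echo"},
        ],
    },
    {
        "name": "mesh-only-route",
        "parentRefs": [{"kind": "Service", "name": "echo"}],
    },
]
print(routes_for_gateway(routes, "edge-gateway"))  # prints ['shared-route']
```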

@kflynn @LiorLieberman @howardjohn does it feel like we've made enough progress on this question that this issue could be closed yet?

mikemorris avatar Oct 13 '25 21:10 mikemorris

/lifecycle rotten

k8s-triage-robot avatar Nov 12 '25 22:11 k8s-triage-robot