
Gateway: Combining HTTPS listener with TLS-termination and TLS listener with TLS-passthrough

Open vehagn opened this issue 1 year ago • 5 comments

Describe the bug:

I'm trying to create a Gateway that uses both an HTTPS listener with a certificate provided by cert-manager and a TLS listener with TLS-passthrough.

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.example.com"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox.example.com"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All

When I add the TLS listener, the Gateway becomes unresponsive for all HTTPRoutes and TLSRoutes connected to it. The event log for the Gateway states:

Skipped a listener block: [spec.listeners[1].tls.certificateRef: Required value: listener has no certificateRefs, spec.listeners[1].tls.mode: Unsupported value: "Passthrough": supported values: "Terminate"]

Expected behaviour:

I expect the Gateway to work with both listeners. Cert-manager should allow/ignore the TLS listener running in Passthrough mode.

Steps to reproduce the bug:

Create the above Gateway.

Anything else we need to know?:

Cilium 1.15.1 provides the GatewayClass. I initially believed this to be a Cilium-issue, but with further investigation it looks to be an issue with Cert-manager.

A workaround is to create two Gateways, each with its own listener. Alternatively, route the Service of the HTTPS listener Gateway through to the TLS listener Gateway so that only one LoadBalancer IP is exposed.
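
As a rough sketch of the two-Gateway workaround, the manifest below splits the Gateway from the report into one Gateway per listener. The names `test-https` and `test-tls-passthrough` are made up for illustration. Only the HTTPS Gateway carries the cert-manager annotation, so the gateway-shim never sees the Passthrough listener.

```yaml
# Gateway 1: HTTPS listener only. cert-manager's gateway-shim manages test-cert here.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-https              # hypothetical name, not from the original report
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.example.com"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
---
# Gateway 2: TLS passthrough listener only. No cert-manager annotation,
# so cert-manager does not inspect this Gateway at all.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-tls-passthrough    # hypothetical name, not from the original report
  namespace: gateway
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox.example.com"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All
```

With this split, each HTTPRoute and TLSRoute has to reference the matching Gateway in its `parentRefs`, and each Gateway is expected to get its own Service, which is why the single-IP alternative above involves extra routing.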

Environment details:

  • Kubernetes version: 1.29.3
  • Cloud-provider/provisioner: Bare metal
  • cert-manager version: 1.14.4
  • Install method: Kustomize + Helm
# kustomization.yaml
helmCharts:
  - name: cert-manager
    repo: https://charts.jetstack.io
    version: 1.14.4
    includeCRDs: true
    releaseName: cert-manager
    namespace: cert-manager
    valuesFile: values.yaml
# values.yaml
installCRDs: true

config:
  apiVersion: controller.config.cert-manager.io/v1alpha1
  kind: ControllerConfiguration
  featureGates:
    ExperimentalGatewayAPISupport: true

/kind bug

vehagn avatar May 04 '24 08:05 vehagn

Hey @vehagn thanks for raising. Where does this log come from?

Skipped a listener block: [spec.listeners[1].tls.certificateRef: Required value: listener has no certificateRefs, spec.listeners[1].tls.mode: Unsupported value: "Passthrough": supported values: "Terminate"]

Does the TLS listener prevent the Certificate being created for the HTTPS listener in the single file example?

hawksight avatar May 07 '24 12:05 hawksight

Thanks for picking up the issue @hawksight,

The log comes from the events when you run kubectl describe on the Gateway resource.

I’m unsure of what you mean by “single file example.” I think the certificate is successfully created for the HTTPS listener, though it could have been created before I added the TLS listener (I can double-check later).

My main issue is that HTTPRoutes connected to the Gateway become unresponsive, meaning that I can’t access the Services behind them.

I tried to solve the issue in the linked PR, but haven’t built an image of the branch and tested it fully.

vehagn avatar May 07 '24 15:05 vehagn

"single file example" I meant this YAML:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.example.com"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox.example.com"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All

I was trying to understand if cert-manager was actually preventing the Gateway from working, or if that is a Gateway concern. I'll look at the PR more closely to understand the change.

Can you share how you have your Gateway installed? Is it via the standard YAML & CRDs or via a particular project that implements the GatewayAPI?

hawksight avatar May 08 '24 15:05 hawksight

Thanks for the clarification @hawksight.

I've done some more testing which I try to explain in detail below.

The testing leads me to believe the connectivity issues might be linked to Cilium Issue #32371. Though I still think the BadConfig Warning message I've attempted to fix in PR #6986 for TLSRoutes in Passthrough mode is an improvement.


I'm fetching the Gateway CRDs from https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/experimental-install.yaml. I'm using the experimental install since I want to use the Gateway spec.infrastructure.annotations field to explicitly set the Gateway Service IP. I've omitted this field in the test Gateway.
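
For context, here is a minimal sketch of how the `spec.infrastructure.annotations` field might be used to pin the Gateway Service IP. The annotation key `io.cilium/lb-ipam-ips` is an assumption about Cilium's LB IPAM and may differ between Cilium versions. The IP is simply the address reported in the Gateway status later in this comment.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
spec:
  gatewayClassName: cilium
  # Experimental-channel field: annotations here are copied onto the
  # infrastructure (e.g. the Service) generated for the Gateway.
  infrastructure:
    annotations:
      # Assumed Cilium LB IPAM annotation key; verify against your Cilium version.
      io.cilium/lb-ipam-ips: "192.168.1.221"
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.stonegarden.dev"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
```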

The full configuration can be found at https://gitlab.com/vehagn/mini-homelab

For a new test, I first create a Gateway without the cert-manager annotation:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
#  annotations:
#    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.stonegarden.dev"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox-test.euclid.stonegarden.dev"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All

and a TLSRoute

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TLSRoute
metadata:
  name: test
  namespace: proxmox
spec:
  parentRefs:
    - name: test
      namespace: gateway
  hostnames:
    - "proxmox-test.euclid.stonegarden.dev"
  rules:
    - backendRefs:
        - name: proxmox-euclid
          port: 443

I'm now able to reach proxmox-test.euclid.stonegarden.dev through the Gateway.

Next, I add the cert-manager annotation (by uncommenting it in the Gateway above):

  annotations:
    cert-manager.io/issuer: cloudflare-issuer

Running kubectl describe on the Gateway I now get

❯ kubectl -n gateway describe gateway test
Name:         test
Namespace:    gateway
Labels:       argocd.argoproj.io/instance=gateway
Annotations:  argocd.argoproj.io/tracking-id: gateway:gateway.networking.k8s.io/Gateway:gateway/test
              cert-manager.io/issuer: cloudflare-issuer
API Version:  gateway.networking.k8s.io/v1
Kind:         Gateway
Metadata:
  Creation Timestamp:  2024-05-09T09:08:12Z
  Generation:          2
  Resource Version:    8865763
  UID:                 8437c2e5-b7e1-4d71-b5d5-15995fe4faa5
Spec:
  Gateway Class Name:  cilium
  Listeners:
    Allowed Routes:
      Namespaces:
        From:  All
    Hostname:  *.stonegarden.dev
    Name:      https-gateway
    Port:      443
    Protocol:  HTTPS
    Tls:
      Certificate Refs:
        Group:  
        Kind:   Secret
        Name:   test-cert
      Mode:     Terminate
    Allowed Routes:
      Namespaces:
        From:  All
    Hostname:  proxmox.euclid.stonegarden.dev
    Name:      proxmox-tls-passthrough
    Port:      443
    Protocol:  TLS
    Tls:
      Mode:  Passthrough
Status:
  Addresses:
    Type:   IPAddress
    Value:  192.168.1.221
  Conditions:
    Last Transition Time:  2024-05-09T09:13:15Z
    Message:               Gateway successfully scheduled
    Observed Generation:   2
    Reason:                Accepted
    Status:                True
    Type:                  Accepted
    Last Transition Time:  2024-05-09T09:13:15Z
    Message:               Gateway successfully reconciled
    Observed Generation:   2
    Reason:                Programmed
    Status:                True
    Type:                  Programmed
  Listeners:
    Attached Routes:  1
    Conditions:
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Programmed
      Observed Generation:   2
      Reason:                Programmed
      Status:                True
      Type:                  Programmed
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Accepted
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Resolved Refs
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Name:                    https-gateway
    Supported Kinds:
      Group:          gateway.networking.k8s.io
      Kind:           HTTPRoute
    Attached Routes:  1
    Conditions:
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Programmed
      Observed Generation:   2
      Reason:                Programmed
      Status:                True
      Type:                  Programmed
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Accepted
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Resolved Refs
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Name:                    proxmox-tls-passthrough
    Supported Kinds:
      Group:  gateway.networking.k8s.io
      Kind:   TLSRoute
Events:
  Type     Reason             Age                From                       Message
  ----     ------             ----               ----                       -------
  Normal   CreateCertificate  66s                cert-manager-gateway-shim  Successfully created Certificate "test-cert"
  Warning  BadConfig          54s (x9 over 66s)  cert-manager-gateway-shim  Skipped a listener block: [spec.listeners[1].tls.certificateRef: Required value: listener has no certificateRefs, spec.listeners[1].tls.mode: Unsupported value: "Passthrough": supported values: "Terminate"]

I can still access proxmox-test.euclid.stonegarden.dev, and I see that the certificate is created successfully. Interestingly, both listeners report one attached route.

Then I comment out the cert-manager annotation again

#  annotations:
#    cert-manager.io/issuer: cloudflare-issuer

and create an HTTPRoute for the Gateway.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-route
  namespace: whoami
spec:
  parentRefs:
    - name: test
      namespace: gateway
  hostnames:
    - "https-test.stonegarden.dev"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: whoami
          port: 80

I now get ERR_CONNECTION_RESET when trying to access https-test.stonegarden.dev. The TLSRoute endpoint proxmox-test.euclid.stonegarden.dev still works.

The HTTPRoute status indicates that it should work.

status:
  parents:
    - conditions:
        - lastTransitionTime: '2024-05-09T22:28:45Z'
          message: Accepted HTTPRoute
          observedGeneration: 2
          reason: Accepted
          status: 'True'
          type: Accepted
        - lastTransitionTime: '2024-05-09T22:28:45Z'
          message: Service reference is valid
          observedGeneration: 2
          reason: ResolvedRefs
          status: 'True'
          type: ResolvedRefs
      controllerName: io.cilium/gateway-controller
      parentRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: test
        namespace: gateway

and the Gateway reports two routes attached to the HTTPS listener

status:
  addresses:
    - type: IPAddress
      value: 192.168.1.221
  conditions:
    - lastTransitionTime: '2024-05-09T22:28:36Z'
      message: Gateway successfully scheduled
      observedGeneration: 11
      reason: Accepted
      status: 'True'
      type: Accepted
    - lastTransitionTime: '2024-05-09T22:28:36Z'
      message: Gateway successfully reconciled
      observedGeneration: 11
      reason: Programmed
      status: 'True'
      type: Programmed
  listeners:
    - attachedRoutes: 2
      conditions:
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Programmed
          observedGeneration: 11
          reason: Programmed
          status: 'True'
          type: Programmed
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Accepted
          observedGeneration: 11
          reason: Accepted
          status: 'True'
          type: Accepted
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Resolved Refs
          reason: ResolvedRefs
          status: 'True'
          type: ResolvedRefs
      name: https-gateway
      supportedKinds:
        - group: gateway.networking.k8s.io
          kind: HTTPRoute
    - attachedRoutes: 1
      conditions:
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Programmed
          observedGeneration: 11
          reason: Programmed
          status: 'True'
          type: Programmed
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Accepted
          observedGeneration: 11
          reason: Accepted
          status: 'True'
          type: Accepted
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Resolved Refs
          reason: ResolvedRefs
          status: 'True'
          type: ResolvedRefs
      name: proxmox-tls-passthrough
      supportedKinds:
        - group: gateway.networking.k8s.io
          kind: TLSRoute

Then I uncomment the Cert-manager annotation again

  annotations:
    cert-manager.io/issuer: cloudflare-issuer

And I can still connect to the TLSRoute endpoint, but not the HTTPRoute endpoint.

After deleting the Gateway and waiting for Argo CD to recreate it, the TLSRoute endpoint also stops working.

After deleting the Gateway and waiting for Argo CD to recreate it once more, the TLSRoute endpoint works again.

Deleting and recreating the gateway appears to continue this flip-flop pattern.

Cert-manager diligently reattaches the certificate it created earlier each time.


Edit:

Removing TLS-listener on Gateway: TLSRoute endpoint still responds, HTTPRoute doesn't. Next deleting the TLSRoute: TLSRoute endpoint stops responding (endpoint presents the wildcard certificate), HTTPRoute endpoint finally starts working!

The TLSRoute appears to work through the HTTPS listener (which is only supposed to accept HTTPRoutes) even without a TLS listener, and it "blocks" the HTTPRoute.

The commit-history of the above testing can be found here.

vehagn avatar May 09 '24 23:05 vehagn

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close. /lifecycle stale

cert-manager-bot avatar Oct 14 '24 09:10 cert-manager-bot