Gateway: Combining HTTPS listener with TLS-termination and TLS listener with TLS-passthrough
Describe the bug:
I'm trying to create a Gateway that uses both an HTTPS listener with a certificate provided by Cert-manager and a TLS listener with TLS-passthrough.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.example.com"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox.example.com"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All
When I add the TLS listener, the Gateway becomes unresponsive for all HTTPRoutes and TLSRoutes connected to it.
The event log for the Gateway states:
Skipped a listener block: [spec.listeners[1].tls.certificateRef: Required value: listener has no certificateRefs, spec.listeners[1].tls.mode: Unsupported value: "Passthrough": supported values: "Terminate"]
Expected behaviour:
I expect the Gateway to work with both listeners. Cert-manager should allow/ignore the TLS listener running in Passthrough mode.
Steps to reproduce the bug:
Create the above Gateway.
Anything else we need to know?:
Cilium 1.15.1 provides the GatewayClass. I initially believed this to be a Cilium issue, but after further investigation it looks to be an issue with Cert-manager.
A workaround is to create two Gateways, each with its own listener. Alternatively, route the HTTPS-listener Gateway's Service through the TLS-listener Gateway so that only one LoadBalancer IP is exposed.
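The two-Gateway workaround could look roughly like the sketch below. The Gateway names `test-https` and `test-tls-passthrough` are hypothetical; everything else is copied from the Gateway above. The passthrough Gateway deliberately omits the Cert-manager annotation, on the assumption that cert-manager then leaves it alone.

```yaml
# Gateway 1: HTTPS listener with TLS termination, certificate from Cert-manager.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-https # hypothetical name
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.example.com"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
---
# Gateway 2: TLS listener in Passthrough mode, no Cert-manager annotation.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-tls-passthrough # hypothetical name
  namespace: gateway
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox.example.com"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All
```

With this split each Gateway gets its own Service, so two LoadBalancer IPs are exposed unless one is routed through the other.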
Environment details:
- Kubernetes version: 1.29.3
- Cloud-provider/provisioner: Bare metal
- cert-manager version: 1.14.4
- Install method: Kustomize + Helm
# kustomization.yaml
helmCharts:
  - name: cert-manager
    repo: https://charts.jetstack.io
    version: 1.14.4
    includeCRDs: true
    releaseName: cert-manager
    namespace: cert-manager
    valuesFile: values.yaml
# values.yaml
installCRDs: true
config:
  apiVersion: controller.config.cert-manager.io/v1alpha1
  kind: ControllerConfiguration
  featureGates:
    ExperimentalGatewayAPISupport: true
/kind bug
Hey @vehagn thanks for raising. Where does this log come from?
Skipped a listener block: [spec.listeners[1].tls.certificateRef: Required value: listener has no certificateRefs, spec.listeners[1].tls.mode: Unsupported value: "Passthrough": supported values: "Terminate"]
Does the TLS listener prevent the Certificate being created for the HTTPS listener in the single file example?
Thanks for picking up the issue @hawksight,
The log comes from the events when you run kubectl describe on the Gateway resource.
I’m unsure of what you mean by “single file example.” I think the certificate is successfully created for the HTTPS-listener, though it could be from before I added the TLS-listener (I can double check later).
My main issue is that HTTPRoutes connected to the Gateway become unresponsive, meaning that I can’t access the Services behind them.
I tried to solve the issue in the linked PR, but haven’t built an image of the branch and tested it fully.
By "single file example" I meant this YAML:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.example.com"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox.example.com"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All
I was trying to understand if cert-manager was actually preventing the Gateway from working, or if that is a Gateway concern. I'll look at the PR more closely to understand the change.
Can you share how you have your Gateway installed? Is it via the standard YAML & CRDs or via a particular project that implements the GatewayAPI?
Thanks for the clarification @hawksight.
I've done some more testing, which I'll try to explain in detail below.
The testing leads me to believe the connectivity issues might be linked to Cilium Issue #32371. I still think that fixing the BadConfig warning for TLSRoutes in Passthrough mode, which I've attempted in PR #6986, is an improvement.
I'm fetching the Gateway CRDs from https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/experimental-install.yaml. I'm using the experimental install since I want to use the Gateway spec.infrastructure.annotations field to explicitly set the Gateway Service IP. I've omitted this field in the test Gateway.
The full configuration can be found at https://gitlab.com/vehagn/mini-homelab
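For context, setting the Gateway Service IP through `spec.infrastructure.annotations` might look like the sketch below. The annotation key `io.cilium/lb-ipam-ips` is an assumption about how Cilium LB IPAM is configured in this setup and is not taken from this issue. The IP is the address the test Gateway reports in its status further down.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
spec:
  gatewayClassName: cilium
  # infrastructure is only available with the experimental-channel CRDs.
  infrastructure:
    annotations:
      # Assumed annotation key for Cilium LB IPAM; verify against your Cilium version.
      io.cilium/lb-ipam-ips: "192.168.1.221"
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.stonegarden.dev"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
```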
For a new test, I first create a Gateway without the Cert-manager annotation:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
  # annotations:
  #   cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.stonegarden.dev"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox-test.euclid.stonegarden.dev"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All
and a TLSRoute
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TLSRoute
metadata:
  name: test
  namespace: proxmox
spec:
  parentRefs:
    - name: test
      namespace: gateway
  hostnames:
    - "proxmox-test.euclid.stonegarden.dev"
  rules:
    - backendRefs:
        - name: proxmox-euclid
          port: 443
I'm now able to reach proxmox-test.euclid.stonegarden.dev through the Gateway.
Next, I add the Cert-manager annotation (by uncommenting it in the above Gateway):
annotations:
  cert-manager.io/issuer: cloudflare-issuer
Running kubectl describe on the Gateway, I now get:
❯ kubectl -n gateway describe gateway test
Name:         test
Namespace:    gateway
Labels:       argocd.argoproj.io/instance=gateway
Annotations:  argocd.argoproj.io/tracking-id: gateway:gateway.networking.k8s.io/Gateway:gateway/test
              cert-manager.io/issuer: cloudflare-issuer
API Version:  gateway.networking.k8s.io/v1
Kind:         Gateway
Metadata:
  Creation Timestamp:  2024-05-09T09:08:12Z
  Generation:          2
  Resource Version:    8865763
  UID:                 8437c2e5-b7e1-4d71-b5d5-15995fe4faa5
Spec:
  Gateway Class Name:  cilium
  Listeners:
    Allowed Routes:
      Namespaces:
        From:  All
    Hostname:  *.stonegarden.dev
    Name:      https-gateway
    Port:      443
    Protocol:  HTTPS
    Tls:
      Certificate Refs:
        Group:
        Kind:   Secret
        Name:   test-cert
      Mode:     Terminate
    Allowed Routes:
      Namespaces:
        From:  All
    Hostname:  proxmox.euclid.stonegarden.dev
    Name:      proxmox-tls-passthrough
    Port:      443
    Protocol:  TLS
    Tls:
      Mode:  Passthrough
Status:
  Addresses:
    Type:   IPAddress
    Value:  192.168.1.221
  Conditions:
    Last Transition Time:  2024-05-09T09:13:15Z
    Message:               Gateway successfully scheduled
    Observed Generation:   2
    Reason:                Accepted
    Status:                True
    Type:                  Accepted
    Last Transition Time:  2024-05-09T09:13:15Z
    Message:               Gateway successfully reconciled
    Observed Generation:   2
    Reason:                Programmed
    Status:                True
    Type:                  Programmed
  Listeners:
    Attached Routes:  1
    Conditions:
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Programmed
      Observed Generation:   2
      Reason:                Programmed
      Status:                True
      Type:                  Programmed
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Accepted
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Resolved Refs
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Name:                    https-gateway
    Supported Kinds:
      Group:  gateway.networking.k8s.io
      Kind:   HTTPRoute
    Attached Routes:  1
    Conditions:
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Programmed
      Observed Generation:   2
      Reason:                Programmed
      Status:                True
      Type:                  Programmed
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Accepted
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Resolved Refs
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Name:                    proxmox-tls-passthrough
    Supported Kinds:
      Group:  gateway.networking.k8s.io
      Kind:   TLSRoute
Events:
  Type     Reason             Age                From                       Message
  ----     ------             ----               ----                       -------
  Normal   CreateCertificate  66s                cert-manager-gateway-shim  Successfully created Certificate "test-cert"
  Warning  BadConfig          54s (x9 over 66s)  cert-manager-gateway-shim  Skipped a listener block: [spec.listeners[1].tls.certificateRef: Required value: listener has no certificateRefs, spec.listeners[1].tls.mode: Unsupported value: "Passthrough": supported values: "Terminate"]
I can still access proxmox-test.euclid.stonegarden.dev and I see that the certificate is created successfully.
Interestingly, both listeners report one attached route.
Then I comment out the Cert-manager annotation again
# annotations:
#   cert-manager.io/issuer: cloudflare-issuer
and create an HTTPRoute for the Gateway:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-route
  namespace: whoami
spec:
  parentRefs:
    - name: test
      namespace: gateway
  hostnames:
    - "https-test.stonegarden.dev"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: whoami
          port: 80
I now get ERR_CONNECTION_RESET when trying to access https-test.stonegarden.dev. The TLSRoute endpoint proxmox-test.euclid.stonegarden.dev still works.
The HTTPRoute status indicates that it should work.
status:
  parents:
    - conditions:
        - lastTransitionTime: '2024-05-09T22:28:45Z'
          message: Accepted HTTPRoute
          observedGeneration: 2
          reason: Accepted
          status: 'True'
          type: Accepted
        - lastTransitionTime: '2024-05-09T22:28:45Z'
          message: Service reference is valid
          observedGeneration: 2
          reason: ResolvedRefs
          status: 'True'
          type: ResolvedRefs
      controllerName: io.cilium/gateway-controller
      parentRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: test
        namespace: gateway
and the Gateway reports two routes attached to the HTTPS-listener:
status:
  addresses:
    - type: IPAddress
      value: 192.168.1.221
  conditions:
    - lastTransitionTime: '2024-05-09T22:28:36Z'
      message: Gateway successfully scheduled
      observedGeneration: 11
      reason: Accepted
      status: 'True'
      type: Accepted
    - lastTransitionTime: '2024-05-09T22:28:36Z'
      message: Gateway successfully reconciled
      observedGeneration: 11
      reason: Programmed
      status: 'True'
      type: Programmed
  listeners:
    - attachedRoutes: 2
      conditions:
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Programmed
          observedGeneration: 11
          reason: Programmed
          status: 'True'
          type: Programmed
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Accepted
          observedGeneration: 11
          reason: Accepted
          status: 'True'
          type: Accepted
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Resolved Refs
          reason: ResolvedRefs
          status: 'True'
          type: ResolvedRefs
      name: https-gateway
      supportedKinds:
        - group: gateway.networking.k8s.io
          kind: HTTPRoute
    - attachedRoutes: 1
      conditions:
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Programmed
          observedGeneration: 11
          reason: Programmed
          status: 'True'
          type: Programmed
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Accepted
          observedGeneration: 11
          reason: Accepted
          status: 'True'
          type: Accepted
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Resolved Refs
          reason: ResolvedRefs
          status: 'True'
          type: ResolvedRefs
      name: proxmox-tls-passthrough
      supportedKinds:
        - group: gateway.networking.k8s.io
          kind: TLSRoute
Then I uncomment the Cert-manager annotation again
annotations:
  cert-manager.io/issuer: cloudflare-issuer
and I can still connect to the TLSRoute endpoint, but not the HTTPRoute endpoint.
After I delete the Gateway and wait for Argo CD to recreate it, the TLSRoute endpoint also stops working.
After I delete the Gateway once more and wait for Argo CD to recreate it, the TLSRoute endpoint works again.
Deleting and recreating the Gateway appears to continue this flip-flop pattern.
Cert-manager diligently reattaches the certificate it created earlier each time.
Edit:
After removing the TLS-listener from the Gateway, the TLSRoute endpoint still responds, but the HTTPRoute endpoint doesn't. After then deleting the TLSRoute, the TLSRoute endpoint stops responding (the endpoint presents the wildcard certificate) and the HTTPRoute endpoint finally starts working!
The TLSRoute appears to work without a TLS-listener, attaching instead to the HTTPS-listener (which is only supposed to accept HTTPRoutes), and "blocks" the HTTPRoute.
The commit-history of the above testing can be found here.
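One way to probe the behaviour described in the edit above might be to pin the TLSRoute to the passthrough listener using the Gateway API `sectionName` field in `parentRefs`. This is only a sketch and was not part of the testing in this thread; whether it changes the outcome with Cilium 1.15.1 is an open assumption.

```yaml
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TLSRoute
metadata:
  name: test
  namespace: proxmox
spec:
  parentRefs:
    - name: test
      namespace: gateway
      # sectionName limits attachment to the listener with this name,
      # rather than to every listener on the Gateway.
      sectionName: proxmox-tls-passthrough
  hostnames:
    - "proxmox-test.euclid.stonegarden.dev"
  rules:
    - backendRefs:
        - name: proxmox-euclid
          port: 443
```

If the HTTPS-listener still counted this route under `attachedRoutes` with `sectionName` set, that would point more firmly at the Gateway implementation than at cert-manager.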