cilium
cilium copied to clipboard
envoy: Avoid short circuit backend filtering
Description
The same service can be used with multiple port types (e.g number and name), so we should continue matching port values for both.
Reported-by: Philip Schmid [email protected]
Testing
The below manifest is used for testing (before and after the changes).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: echo
spec:
ingressClassName: cilium
rules:
- http:
paths:
- backend:
service:
name: echo
port:
name: http
path: /named
pathType: Prefix
- http:
paths:
- backend:
service:
name: echo
port:
number: 80
path: /number
pathType: Prefix
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: server
name: server
spec:
replicas: 1
selector:
matchLabels:
app: server
template:
metadata:
labels:
app: server
spec:
containers:
- env:
- name: NODE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
- name: PRINT_HTTP_REQUEST_HEADERS
value: "true"
- name: PORT
value: "80"
image: ghcr.io/philipschmid/echo-app:v0.4
imagePullPolicy: IfNotPresent
name: server
ports:
- containerPort: 80
name: http
protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
name: echo
spec:
ports:
- name: http
port: 80
protocol: TCP
targetPort: 80
selector:
app: server
Before the changes, traffic to /named is returning 503
$ lb=$(kubectl get ingress echo -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ curl -v http://$lb/named
* Trying 10.98.35.245:80...
* Connected to 10.98.35.245 (10.98.35.245) port 80
> GET /named HTTP/1.1
> Host: 10.98.35.245
> User-Agent: curl/8.8.0
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 503 Service Unavailable
< content-length: 19
< content-type: text/plain
< date: Wed, 26 Jun 2024 10:10:28 GMT
< server: envoy
<
* Connection #0 to host 10.98.35.245 left intact
no healthy upstream%
After the changes
$ lb=$(kubectl get ingress echo -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ curl -v http://$lb/number
* Trying 10.96.175.210:80...
* Connected to 10.96.175.210 (10.96.175.210) port 80
> GET /number HTTP/1.1
> Host: 10.96.175.210
> User-Agent: curl/8.8.0
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< content-type: application/json
< date: Wed, 26 Jun 2024 10:27:42 GMT
< content-length: 327
< x-envoy-upstream-service-time: 0
< server: envoy
<
{"timestamp":"2024-06-26T10:27:42.213Z","source_ip":"10.244.0.93","hostname":"server-866c4fc4c7-g5l7n","node":"minikube","headers":{"Accept":["*/*"],"User-Agent":["curl/8.8.0"],"X-Envoy-Internal":["true"],"X-Forwarded-For":["192.168.49.1"],"X-Forwarded-Proto":["http"],"X-Request-Id":["87612d4e-b85e-4daf-b1d4-3b50936913c9"]}}
* Connection #0 to host 10.96.175.210 left intact
$ curl -v http://$lb/named
* Trying 10.96.175.210:80...
* Connected to 10.96.175.210 (10.96.175.210) port 80
> GET /named HTTP/1.1
> Host: 10.96.175.210
> User-Agent: curl/8.8.0
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< content-type: application/json
< date: Wed, 26 Jun 2024 10:27:45 GMT
< content-length: 327
< x-envoy-upstream-service-time: 0
< server: envoy
<
{"timestamp":"2024-06-26T10:27:45.681Z","source_ip":"10.244.0.93","hostname":"server-866c4fc4c7-g5l7n","node":"minikube","headers":{"Accept":["*/*"],"User-Agent":["curl/8.8.0"],"X-Envoy-Internal":["true"],"X-Forwarded-For":["192.168.49.1"],"X-Forwarded-Proto":["http"],"X-Request-Id":["4bc965fa-6399-4b34-8af1-3d5f3946d729"]}}
* Connection #0 to host 10.96.175.210 left intact
/test
LGTM 🎉. I did a manual E2E test on an RKE2 cluster with the following values:
kubeProxyReplacement: "true"
# Let's use the RKE2 internal TCP LB which points to all CP nodes:
k8sServiceHost: 127.0.0.1
k8sServicePort: 6443
hubble:
relay:
enabled: true
operator:
replicas: 1
image:
override: quay.io/cilium/operator-generic-ci:229a9da1935c6a1e9b2b070dc0afd21e4b3e0f67@sha256:65383f4cb78697932a174f5eefecc95648ba5615a7a64d07ebdd4485d430cadc
ingressController:
enabled: true
loadbalancerMode: dedicated
ingressLBAnnotationPrefixes: ['service.beta.kubernetes.io', 'service.kubernetes.io', 'cloud.google.com', 'io.cilium/lb-ipam-ips']
service:
annotations:
"io.cilium/lb-ipam-ips": "10.10.10.1"
bgpControlPlane:
enabled: true
image:
override: quay.io/cilium/cilium-ci:229a9da1935c6a1e9b2b070dc0afd21e4b3e0f67@sha256:5d648a69443b89d34991b437971cb1db63d877c58aad69c13ccfc62226ab143a
envoy:
enabled: false
Works for both of the following two Ingress resources while it didn't work before with the v1.16.0-rc.0 image for the echo-shared with the named port:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: echo-dedicated
namespace: demo
annotations:
ingress.cilium.io/loadbalancer-mode: dedicated
"io.cilium/lb-ipam-ips": 10.10.10.2"
spec:
ingressClassName: cilium
rules:
- host: 10.10.10.2.sslip.io
http:
paths:
- pathType: Prefix
path: /
backend:
service:
name: echo
port:
number: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: echo-shared
namespace: demo
annotations:
ingress.cilium.io/loadbalancer-mode: shared
spec:
ingressClassName: cilium
rules:
- host: 10.10.10.1.sslip.io
http:
paths:
- pathType: Prefix
path: /
backend:
service:
name: echo
port:
name: http
/test