Incorrect destination IP in L7 ingress flows with L7 egress and ingress policy configured for same port
Is there an existing issue for this?
- [X] I have searched the existing issues
What happened?
When configuring both ingress and egress L7 HTTP policies that target the same port, the destination IP in the L7 ingress flows is the cilium node IP, so the destination pod_name is missing. The L7 egress flows still seem to have the correct source/destination IPs for the pods.
I suspect this is related to the fact that the packet is going through envoy which will appear as the cilium node.
I've been able to reliably reproduce this on a single-node kind setup locally. If I disable the egress policy, the destination IP is correct; similarly, if I disable the ingress policy, the IPs remain correct. It's only an issue when both egress and ingress policies are enabled for the same port. I also tried enabling the ingress policy first and then the egress policy, as well as the opposite order (egress first, then ingress). Ordering does not seem to matter; it still only affects the destination IP of the ingress flows, and only when both L7 ingress/egress policies are enabled.
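A rough sketch of the toggling described above (the policy names and manifest filename here are placeholders; the actual manifests are in the gist linked under "Anything else?" below):

# with both L7 policies applied the ingress flow destination IP is wrong;
# removing either one of them makes it correct again
kubectl -n tenant-jobs get cnp
kubectl -n tenant-jobs delete cnp <l7-egress-policy>   # placeholder name
hubble observe -t l7 -n tenant-jobs --protocol http -o jsonpb | grep INGRESS | jq '.flow.IP' -rc
# re-apply the policy and repeat with the ingress policy removed instead
kubectl -n tenant-jobs apply -f <gist-manifest>.yaml   # placeholder filename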
Cilium Version
Client: 1.12.1 4c9a630 2022-08-15T16:29:39-07:00 go version go1.18.5 linux/arm64
Kernel Version
Linux lima-docker 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:08:11 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
Kubernetes Version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:26:19Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-19T15:42:59Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/arm64"}
Sysdump
cilium-sysdump-20220826-165646.zip
Relevant log output
# 10.0.0.196 is the cilium node IP, 10.0.0.25 is the client app (crawler) and 10.0.0.146 is the server app (loader)
# notice how ingress flows have the destination as the cilium node IP, but egress flows have the correct values.
(⎈|kind-kind:default) ~/p/w/kind-cilium-ce-helm-install ❯❯❯ hubble observe -t l7 -n tenant-jobs --protocol http -o jsonpb | grep EGRESS | jq '.flow | {ip: .IP, source_pod: .source.pod_name, dest_pod: .destination.pod_name}' -rc
{"ip":{"source":"10.0.0.25","destination":"10.0.0.146","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":"crawler-86b7cbc87-fbqzc"}
{"ip":{"source":"10.0.0.146","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":"crawler-86b7cbc87-fbqzc","dest_pod":"loader-597c4c8b54-8vn84"}
{"ip":{"source":"10.0.0.146","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":"crawler-86b7cbc87-fbqzc","dest_pod":"loader-597c4c8b54-8vn84"}
{"ip":{"source":"10.0.0.25","destination":"10.0.0.146","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":"crawler-86b7cbc87-fbqzc"}
{"ip":{"source":"10.0.0.25","destination":"10.0.0.146","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":"crawler-86b7cbc87-fbqzc"}
{"ip":{"source":"10.0.0.146","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":"crawler-86b7cbc87-fbqzc","dest_pod":"loader-597c4c8b54-8vn84"}
{"ip":{"source":"10.0.0.25","destination":"10.0.0.146","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":"crawler-86b7cbc87-fbqzc"}
{"ip":{"source":"10.0.0.146","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":"crawler-86b7cbc87-fbqzc","dest_pod":"loader-597c4c8b54-8vn84"}
{"ip":{"source":"10.0.0.25","destination":"10.0.0.146","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":"crawler-86b7cbc87-fbqzc"}
{"ip":{"source":"10.0.0.146","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":"crawler-86b7cbc87-fbqzc","dest_pod":"loader-597c4c8b54-8vn84"}
(⎈|kind-kind:default) ~/p/w/kind-cilium-ce-helm-install ❯❯❯ hubble observe -t l7 -n tenant-jobs --protocol http -o jsonpb | grep INGRESS | jq '.flow | {ip: .IP, source_pod: .source.pod_name, dest_pod: .destination.pod_name}' -rc
{"ip":{"source":"10.0.0.25","destination":"10.0.0.196","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":null}
{"ip":{"source":"10.0.0.196","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":null,"dest_pod":"loader-597c4c8b54-8vn84"}
{"ip":{"source":"10.0.0.25","destination":"10.0.0.196","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":null}
{"ip":{"source":"10.0.0.196","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":null,"dest_pod":"loader-597c4c8b54-8vn84"}
{"ip":{"source":"10.0.0.25","destination":"10.0.0.196","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":null}
{"ip":{"source":"10.0.0.196","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":null,"dest_pod":"loader-597c4c8b54-8vn84"}
{"ip":{"source":"10.0.0.25","destination":"10.0.0.196","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":null}
{"ip":{"source":"10.0.0.196","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":null,"dest_pod":"loader-597c4c8b54-8vn84"}
{"ip":{"source":"10.0.0.25","destination":"10.0.0.196","ipVersion":"IPv4"},"source_pod":"loader-597c4c8b54-8vn84","dest_pod":null}
{"ip":{"source":"10.0.0.196","destination":"10.0.0.25","ipVersion":"IPv4"},"source_pod":null,"dest_pod":"loader-597c4c8b54-8vn84"}
Anything else?
L7 policies in question: https://gist.github.com/chancez/d717d68e329c4997fbdf896cdddf15ce
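For illustration only, the relevant shape is an L7 HTTP rule on the same port in both the egress direction (on the client) and the ingress direction (on the server). The names, labels, and port below are placeholders, not the actual gist contents; use the gist itself for the real repro:

kubectl apply -f - <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: server-l7-ingress      # placeholder name
  namespace: tenant-jobs
spec:
  endpointSelector:
    matchLabels:
      app: loader              # placeholder label
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: crawler           # placeholder label
    toPorts:
    - ports:
      - port: "50051"          # illustrative port, same as in the egress rule below
        protocol: TCP
      rules:
        http:
        - {}                   # match-all HTTP rule, enough to trigger the L7 proxy
---
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: client-l7-egress       # placeholder name
  namespace: tenant-jobs
spec:
  endpointSelector:
    matchLabels:
      app: crawler             # placeholder label
  egress:
  - toEndpoints:
    - matchLabels:
        app: loader            # placeholder label
    toPorts:
    - ports:
      - port: "50051"          # same port as the ingress rule above
        protocol: TCP
      rules:
        http:
        - {}
EOF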
This is using the jobs-app from the Cilium enterprise demos (note: the demo app ships these policies, but without the L7 egress policy on those ports, so you'll need to apply the manifest from the gist after installing):
helm install jobs-app isovalent/jobs-app --version v0.2.0 --namespace tenant-jobs
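and then, assuming the gist above has been saved locally (the filename is arbitrary):

kubectl -n tenant-jobs apply -f l7-policies.yaml   # manifest downloaded from the gist above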
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
I recall we had trouble with overlapping 5-tuples at least in the case where both egress and ingress policies are in place and both source and destination are on the same node. For this reason, IIRC:
- Ingress proxy always uses the node IP as the source address when forwarding to the destination pod. This does not affect the exported (hubble) flow information, as that is based on the incoming 5-tuple.
- Egress proxy uses the node IP as the source if the destination is on the same node. This does not affect the flow entry exported by the egress proxy, as the 5-tuple is from the incoming connection. L7 LB takes the destination address from the upstream connection, but keeps the original source address from the incoming (downstream) connection. This WILL affect the flow exported at ingress, both at the L3 and L7 levels, as the incoming source address is the node IP. The source security ID is always carried through within the node, so that is not affected.
We have NOTRACK rules for proxy redirection connections, so connection tracking is not the issue here, but the host networking namespace would not be able to differentiate between the connections to the egress and ingress listeners if they both had the same 5-tuple.
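If it helps to sanity-check this on a live node, the proxy-related NOTRACK entries can be listed from the agent pod. This is just a generic inspection sketch, assuming the default cilium DaemonSet in kube-system:

# dump the raw table and filter for the NOTRACK rules mentioned above
kubectl -n kube-system exec ds/cilium -- iptables-save -t raw | grep -i notrack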
Since this happens only for connections on the same node, we could probably paper over this in Cilium agent, if so desired. It could be a bit confusing if hubble flows report 5-tuples that don’t really match the network traffic (or the L3/4 flow entries), but this already happens with L7 LB flows as described above.
But it is actually weird that the destination IP is the node IP, unless that is a return packet for a connection whose source address was the node IP. We always preserve the original destination IP address/port when forwarding traffic through Envoy, as otherwise the destination node would not know which pod the traffic should go to. Having the port numbers visible alongside the IP addresses would reveal the direction (original/reply) of the flows.
So I was doing some more testing, and it seems this is actually not limited to the destination IP as I thought, and it's not consistently broken either. I applied the following policy: https://gist.github.com/chancez/d6e08d2f0ae8e513f3f28144958bd9ea
I discovered some ingress flows with an incorrect source and some with a missing destination. In this output, my cilium node IP is 10.0.0.69. I've also included source/destination ports and the L7 type in case that helps.
(⎈|kind-kind:default) ~/p/w/kind-cilium-ce-helm-install ❯❯❯ hubble observe -t l7 -n tenant-jobs --protocol http -o jsonpb | grep INGRESS | jq '.flow | {type: .l7.type, direction: .traffic_direction, source_pod: .source.pod_name, dest_pod: .destination.pod_name, source: "\(.IP.source):\(.l4.TCP.source_port)", dest: "\(.IP.destination):\(.l4.TCP.destination_port)"}' -rc
{"type":"REQUEST","direction":"INGRESS","source_pod":"coreapi-546797cd76-8m28l","dest_pod":"elasticsearch-master-0","source":"10.0.0.50:42420","dest":"10.0.0.72:9200"}
{"type":"RESPONSE","direction":"INGRESS","source_pod":"elasticsearch-master-0","dest_pod":"coreapi-546797cd76-8m28l","source":"10.0.0.72:9200","dest":"10.0.0.50:42420"}
{"type":"RESPONSE","direction":"INGRESS","source_pod":"coreapi-546797cd76-8m28l","dest_pod":null,"source":"10.0.0.50:9080","dest":"10.0.0.69:57714"}
{"type":"REQUEST","direction":"INGRESS","source_pod":"crawler-86b7cbc87-p27dk","dest_pod":"loader-597c4c8b54-8mv9s","source":"10.0.0.126:48382","dest":"10.0.0.151:50051"}
{"type":"RESPONSE","direction":"INGRESS","source_pod":"loader-597c4c8b54-8mv9s","dest_pod":"crawler-86b7cbc87-p27dk","source":"10.0.0.151:50051","dest":"10.0.0.126:48382"}
{"type":"REQUEST","direction":"INGRESS","source_pod":null,"dest_pod":"coreapi-546797cd76-8m28l","source":"10.0.0.69:57716","dest":"10.0.0.50:9080"}
{"type":"REQUEST","direction":"INGRESS","source_pod":"coreapi-546797cd76-8m28l","dest_pod":"elasticsearch-master-0","source":"10.0.0.50:42420","dest":"10.0.0.72:9200"}
{"type":"RESPONSE","direction":"INGRESS","source_pod":"elasticsearch-master-0","dest_pod":"coreapi-546797cd76-8m28l","source":"10.0.0.72:9200","dest":"10.0.0.50:42420"}
{"type":"RESPONSE","direction":"INGRESS","source_pod":"coreapi-546797cd76-8m28l","dest_pod":null,"source":"10.0.0.50:9080","dest":"10.0.0.69:57716"}
{"type":"REQUEST","direction":"INGRESS","source_pod":"crawler-86b7cbc87-p27dk","dest_pod":"loader-597c4c8b54-8mv9s","source":"10.0.0.126:48382","dest":"10.0.0.151:50051"}
{"type":"RESPONSE","direction":"INGRESS","source_pod":"loader-597c4c8b54-8mv9s","dest_pod":"crawler-86b7cbc87-p27dk","source":"10.0.0.151:50051","dest":"10.0.0.126:48382"}
{"type":"REQUEST","direction":"INGRESS","source_pod":null,"dest_pod":"coreapi-546797cd76-8m28l","source":"10.0.0.69:57718","dest":"10.0.0.50:9080"}
{"type":"REQUEST","direction":"INGRESS","source_pod":"coreapi-546797cd76-8m28l","dest_pod":"elasticsearch-master-0","source":"10.0.0.50:42420","dest":"10.0.0.72:9200"}
{"type":"RESPONSE","direction":"INGRESS","source_pod":"elasticsearch-master-0","dest_pod":"coreapi-546797cd76-8m28l","source":"10.0.0.72:9200","dest":"10.0.0.50:42420"}
{"type":"RESPONSE","direction":"INGRESS","source_pod":"coreapi-546797cd76-8m28l","dest_pod":null,"source":"10.0.0.50:9080","dest":"10.0.0.69:57718"}
I couldn't find any egress flows with this issue in this case though, so it seems like it's still limited to the ingress flows, but perhaps not only the destination IP; the source can be wrong as well.
In the above logs the destination IP is only ever the node IP for (ingress) response packets, which makes sense, since the egress visibility policy redirects to Envoy, which uses the node IP as the source because the destination is on the same node and the 5-tuples arriving at the egress and ingress proxies need to be different.
Policy enforcement is not affected as in this case we can always carry the source security ID to the destination.
The reason behind this behavior makes sense, though it breaks the metrics labels like source/destination pod because the pod name is missing, and obviously the IP being "wrong" can be confusing. At least it's limited to a specific circumstance.
@chancez based on the discussion above do you think there's a good solution that we can implement?
I don't know enough about how packets traverse cilium to have any suggestions on solutions, I think that's something @jrajahalme might be able to answer better.
> Policy enforcement is not affected as in this case we can always carry the source security ID to the destination.
Also, this is correct, but it does break filtering for hubble flows since the IP is not what the user would expect.
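For example, a quick sketch using the IPs from the first log excerpt above (10.0.0.146 being a pod IP and 10.0.0.196 the node IP): filtering on the pod IP returns only the egress flows, while the corresponding ingress flows only show up when filtering on the node IP.

# expected to show both egress and ingress L7 flows for this pod, but the
# affected ingress flows carry the node IP instead and are silently missed
hubble observe -n tenant-jobs -t l7 --protocol http --ip 10.0.0.146
# the "missing" ingress flows only match the node IP
hubble observe -n tenant-jobs -t l7 --protocol http --ip 10.0.0.196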
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.