cilium-cli

Connectivity test fails consistently in a perfectly working native-routing/DSR configuration

Open jorhett opened this issue 1 year ago • 1 comment

Bug report

In a properly configured, deployed, and 100% operational native-routing setup with direct server return (DSR, no SNAT), the connectivity test consistently fails 6 tests / 15 actions.

General Information

  • Cilium CLI version:
$ cilium version -n cilium
cilium-cli: v0.16.7 compiled with go1.22.2 on linux/amd64
cilium image (default): v1.15.4
cilium image (stable): v1.15.4
cilium image (running): 1.15.4
  • Orchestration system version in use (e.g. kubectl version, ...)
$ kubectl version
Client Version: v1.30.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4
  • Platform / infrastructure information: hardware box running Ubuntu jammy
  • Link to relevant artifacts (policies, deployments scripts, ...)
  • Generate and upload a system zip: cilium sysdump
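
The report does not attach the agent configuration. As a minimal sketch of what could be captured here, the filter below keeps only the datapath-related keys from `cilium config view` output. The key names are assumptions based on the Cilium 1.15 ConfigMap, and the function name `filter_routing_keys` is hypothetical.

```shell
#!/bin/sh
# Keep only the routing / load-balancing keys from `cilium config view` output.
# The key list is an assumption (Cilium 1.15 ConfigMap names); adjust as needed.
filter_routing_keys() {
    grep -E '^(routing-mode|ipv4-native-routing-cidr|auto-direct-node-routes|kube-proxy-replacement|bpf-lb-mode|enable-ipv4-masquerade)[[:space:]]'
}

# Usage against a live cluster (not executed by this sketch):
#   cilium config view -n cilium | filter_routing_keys
```

In a native-routing/DSR setup one would expect to see values such as `routing-mode native` and `bpf-lb-mode dsr`, though the exact values used in this cluster are not stated in the report.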

How to reproduce the issue

  1. Configure according to the instructions at https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/
  2. Run cilium connectivity test

Results:

📋 Test Report
❌ 6/45 tests failed (15/193 actions), 33 tests skipped, 0 scenarios skipped:
Test [echo-ingress-l7]:
  ❌ echo-ingress-l7/pod-to-pod-with-endpoints/curl-ipv4-1-public: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> curl-ipv4-1-public (10.249.0.7:8080)
  ❌ echo-ingress-l7/pod-to-pod-with-endpoints/curl-ipv4-1-private: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> curl-ipv4-1-private (10.249.0.7:8080)
  ❌ echo-ingress-l7/pod-to-pod-with-endpoints/curl-ipv4-1-privatewith-header: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> curl-ipv4-1-privatewith-header (10.249.0.7:8080)
Test [echo-ingress-l7-named-port]:
  ❌ echo-ingress-l7-named-port/pod-to-pod-with-endpoints/curl-ipv4-0-public: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> curl-ipv4-0-public (10.249.0.7:8080)
  ❌ echo-ingress-l7-named-port/pod-to-pod-with-endpoints/curl-ipv4-0-private: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> curl-ipv4-0-private (10.249.0.7:8080)
  ❌ echo-ingress-l7-named-port/pod-to-pod-with-endpoints/curl-ipv4-0-privatewith-header: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> curl-ipv4-0-privatewith-header (10.249.0.7:8080)
Test [client-egress-l7-method]:
  ❌ client-egress-l7-method/pod-to-pod-with-endpoints/curl-ipv4-1-public: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> curl-ipv4-1-public (10.249.0.7:8080)
  ❌ client-egress-l7-method/pod-to-pod-with-endpoints/curl-ipv4-1-private: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> curl-ipv4-1-private (10.249.0.7:8080)
  ❌ client-egress-l7-method/pod-to-pod-with-endpoints/curl-ipv4-1-privatewith-header: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> curl-ipv4-1-privatewith-header (10.249.0.7:8080)
Test [client-egress-l7]:
  ❌ client-egress-l7/pod-to-pod/curl-ipv4-1: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> cilium-test/echo-same-node-7f896b84-vckjk (10.249.0.7:8080)
  ❌ client-egress-l7/pod-to-world/http-to-one.one.one.one.-1: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> one.one.one.one.-http (one.one.one.one.:80)
Test [client-egress-l7-named-port]:
  ❌ client-egress-l7-named-port/pod-to-pod/curl-ipv4-1: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> cilium-test/echo-same-node-7f896b84-vckjk (10.249.0.7:8080)
  ❌ client-egress-l7-named-port/pod-to-world/http-to-one.one.one.one.-0: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> one.one.one.one.-http (one.one.one.one.:80)
Test [to-fqdns]:
  ❌ to-fqdns/pod-to-world/http-to-one.one.one.one.-0: cilium-test/client-69748f45d8-w7vhl (10.249.0.194) -> one.one.one.one.-http (one.one.one.one.:80)
  ❌ to-fqdns/pod-to-world/http-to-one.one.one.one.-1: cilium-test/client2-ccd7b8bdf-j94zw (10.249.0.217) -> one.one.one.one.-http (one.one.one.one.:80)
connectivity test failed: 6 tests failed
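
To cross-check the summary line against the per-test listing, the report can be tallied with a small awk sketch. It assumes the report was saved to a file or piped in, and that failed actions are the indented lines containing ❌ under a `Test [name]:` header, as in the output above. The function name `summarize_failures` is hypothetical.

```shell
#!/bin/sh
# Tally failed actions per test from a saved `cilium connectivity test` report.
# Reads the files given as arguments, or stdin when none are given.
summarize_failures() {
    awk '
        # A "Test [name]:" header starts a new group.
        /^Test \[/ {
            name = $0
            sub(/^Test \[/, "", name)
            sub(/\]:.*$/, "", name)
            if (!(name in count)) { order[++n] = name; count[name] = 0 }
            next
        }
        # Indented lines with the failure marker are failed actions.
        /^[ \t]/ && index($0, "❌") > 0 && name != "" { count[name]++; total++ }
        END {
            for (i = 1; i <= n; i++) printf "%s %d\n", order[i], count[order[i]]
            printf "total %d\n", total
        }
    ' "$@"
}
```

Run against the report above, this should print 3, 3, 3, 2, 2, 2 for the six tests and a total of 15, matching the 15/193 failed actions in the summary line.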

Observations

  • All of these connections succeed outside the test run, when the test's special network policies are not applied.
  • Watching the test run shows the failures start immediately after

ℹ️ 📜 Applying CiliumNetworkPolicy 'echo-ingress-l7-http' to namespace 'cilium-test'..

and continue until right after

ℹ️ 📜 Deleting CiliumNetworkPolicy 'client-egress-l7-http-named-port' from namespace 'cilium-test'..
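
To reproduce only that window rather than the full suite, one option is to rerun just the tests named in a saved report. The sketch below prints such a command and does not run it. It assumes Cilium is installed in the `cilium` namespace (as the `cilium version -n cilium` call suggests) and relies on the `--test` filter of `cilium connectivity test`. The function name `rerun_command` is hypothetical.

```shell
#!/bin/sh
# Print (do not run) a command that reruns only the tests listed in a report.
# Reads the files given as arguments, or stdin when none are given.
rerun_command() {
    # Pull the test names out of the "Test [name]:" header lines.
    tests=$(sed -n 's/^Test \[\(.*\)]:.*$/\1/p' "$@")
    # Assumption: the Cilium installation lives in the "cilium" namespace.
    cmd="cilium connectivity test -n cilium"
    for t in $tests; do
        cmd="$cmd --test $t"
    done
    printf '%s\n' "$cmd"
}
```

As far as I understand the CLI, `--test` values are treated as regular expressions, so a name such as `client-egress-l7` would also select `client-egress-l7-method` and `client-egress-l7-named-port`. If Hubble is enabled (the report does not say), `hubble observe --namespace cilium-test --verdict DROPPED` during the rerun may show where the L7-policy traffic is being dropped.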

Suggestion

I haven't looked deeply at the policies (I filed this to raise awareness rather than sit on it until I have time), but it appears these tests are all intended to exercise routing in SNAT scenarios?

jorhett, May 09 '24 00:05

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

github-actions[bot], Sep 28 '24 01:09

This issue has not seen any activity since it was marked stale. Closing.

github-actions[bot], Oct 13 '24 02:10