
cilium connectivity test failures

Open jerrac opened this issue 4 years ago • 27 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

What happened?

When running cilium connectivity test, several tests are failing: no-policies, client-egress-l7, and to-fqdns. They all fail because curl exits with code 28.

The thing is, when I run a curlimages/curl container and run the commands listed in the test output (removing the -w and --output flags), it downloads the page just fine.
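
For reference, my manual reproduction looks roughly like this (a sketch; the pod name is arbitrary and the curl flags are copied from the test output):

# run a throwaway curl pod and repeat the request from the failing action
kubectl run -it --rm manual-curl --image=curlimages/curl --restart=Never -- \
  curl --silent --fail --show-error --connect-timeout 5 http://one.one.one.one:80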

This is on a new Kubernetes 1.23 cluster. Cilium was the first thing I installed. Ubuntu 20.04 is the host OS. It is running FirewallD for a host level firewall.

Is this something I should try to fix, or can I ignore these tests?

Cilium Version

cilium-cli: v0.10.0 compiled with go1.17.4 on linux/amd64
cilium image (default): v1.11.0
cilium image (stable): v1.11.0
cilium image (running): v1.11.0

Kernel Version

Linux controlplane 5.4.0-91-generic #102-Ubuntu SMP Fri Nov 5 16:31:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:41:01Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:34:54Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}

Sysdump

No response

Relevant log output

root@controlplane:~# cilium connectivity test
ℹ️  Monitor aggregation detected, will skip some flow validation steps
⌛ [clustername] Waiting for deployments [client client2 echo-same-node] to become ready...
⌛ [clustername] Waiting for deployments [echo-other-node] to become ready...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/client-7568bc7f86-2mdgt to appear...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/client2-686d5f784b-5llc9 to appear...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/echo-other-node-59d779959c-2jggr to appear...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/echo-same-node-5767b7b99d-rmcqc to appear...
⌛ [clustername] Waiting for Service cilium-test/echo-same-node to become ready...
⌛ [clustername] Waiting for Service cilium-test/echo-other-node to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.8:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.8:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.9:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.9:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.207:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.207:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.7:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.7:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.208:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.208:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.209:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.209:32187 (cilium-test/echo-same-node) to become ready...
ℹ️  Skipping IPCache check
⌛ [clustername] Waiting for pod cilium-test/client-7568bc7f86-2mdgt to reach default/kubernetes service...
⌛ [clustername] Waiting for pod cilium-test/client2-686d5f784b-5llc9 to reach default/kubernetes service...
🔭 Enabling Hubble telescope...
⚠️  Unable to contact Hubble Relay, disabling Hubble telescope and flow validation: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:4245: connect: connection refused"
ℹ️  Expose Relay locally with:
   cilium hubble enable
   cilium hubble port-forward&
🏃 Running tests...

[=] Test [no-policies]
................................................
[=] Test [allow-all]
............................................
[=] Test [client-ingress]
..
[=] Test [echo-ingress]
....
[=] Test [client-egress]
....
[=] Test [to-entities-world]
......
[=] Test [to-cidr-1111]
....
[=] Test [echo-ingress-l7]
....
[=] Test [client-egress-l7]
........
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-only-dns' to namespace 'cilium-test'..
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-l7-http' to namespace 'cilium-test'..
  [-] Scenario [client-egress-l7/pod-to-pod]
  [.] Action [client-egress-l7/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> cilium-test/echo-same-node-5767b7b99d-rmcqc (nnn.nnn.3.159:8080)]
  [.] Action [client-egress-l7/pod-to-pod/curl-1: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.193:8080)]
  [.] Action [client-egress-l7/pod-to-pod/curl-2: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.193:8080)]
  [.] Action [client-egress-l7/pod-to-pod/curl-3: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> cilium-test/echo-same-node-5767b7b99d-rmcqc (nnn.nnn.3.159:8080)]
  [-] Scenario [client-egress-l7/pod-to-world]
  [.] Action [client-egress-l7/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> one-one-one-one-http (one.one.one.one:80)]
  [.] Action [client-egress-l7/pod-to-world/https-to-one-one-one-one-0: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> one-one-one-one-https (one.one.one.one:443)]
  [.] Action [client-egress-l7/pod-to-world/https-to-one-one-one-one-index-0: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> one-one-one-one-https-index (one.one.one.one:443)]
  [.] Action [client-egress-l7/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-http (one.one.one.one:80)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null http://one.one.one.one:80" failed: command terminated with exit code 28
  ℹ️  curl output:
  curl: (28) Resolving timed out after 5000 milliseconds
:0 -> :0 = 000
  
  📄 No flows recorded during action http-to-one-one-one-one-1
  📄 No flows recorded during action http-to-one-one-one-one-1
  [.] Action [client-egress-l7/pod-to-world/https-to-one-one-one-one-1: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-https (one.one.one.one:443)]
  [.] Action [client-egress-l7/pod-to-world/https-to-one-one-one-one-index-1: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-https-index (one.one.one.one:443)]
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-only-dns' from namespace 'cilium-test'..
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-l7-http' from namespace 'cilium-test'..

[=] Test [dns-only]
..........
[=] Test [to-fqdns]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-to-fqdns-one-one-one-one' to namespace 'cilium-test'..
  [-] Scenario [to-fqdns/pod-to-world]
  [.] Action [to-fqdns/pod-to-world/http-to-one-one-one-one-0: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-http (one.one.one.one:80)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null http://one.one.one.one:80" failed: command terminated with exit code 28
  ℹ️  curl output:
  curl: (28) Resolving timed out after 5000 milliseconds
:0 -> :0 = 000
  
  📄 No flows recorded during action http-to-one-one-one-one-0
  📄 No flows recorded during action http-to-one-one-one-one-0
  [.] Action [to-fqdns/pod-to-world/https-to-one-one-one-one-0: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-https (one.one.one.one:443)]
  [.] Action [to-fqdns/pod-to-world/https-to-one-one-one-one-index-0: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-https-index (one.one.one.one:443)]
  [.] Action [to-fqdns/pod-to-world/http-to-one-one-one-one-1: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> one-one-one-one-http (one.one.one.one:80)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null http://one.one.one.one:80" failed: command terminated with exit code 28
  ℹ️  curl output:
  curl: (28) Resolving timed out after 5000 milliseconds
:0 -> :0 = 000
  
  📄 No flows recorded during action http-to-one-one-one-one-1
  📄 No flows recorded during action http-to-one-one-one-one-1
  [.] Action [to-fqdns/pod-to-world/https-to-one-one-one-one-1: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> one-one-one-one-https (one.one.one.one:443)]
  [.] Action [to-fqdns/pod-to-world/https-to-one-one-one-one-index-1: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> one-one-one-one-https-index (one.one.one.one:443)]
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-to-fqdns-one-one-one-one' from namespace 'cilium-test'..

📋 Test Report
❌ 2/11 tests failed (3/142 actions), 0 tests skipped, 0 scenarios skipped:
Test [client-egress-l7]:
  ❌ client-egress-l7/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-http (one.one.one.one:80)
Test [to-fqdns]:
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-0: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-http (one.one.one.one:80)
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-1: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> one-one-one-one-http (one.one.one.one:80)
Connectivity test failed: 2 tests failed

Anything else?

This issue, https://github.com/cilium/cilium/issues/18273, describes a similar failing test, but it only mentions one of the tests failing, not both as in my result.

I reran the connectivity test today, about two weeks after my initial run (and Slack post), and got the same result. That said, the few things I have started running on my cluster all seem to be working fine.

I do have the sysdump file, but I'm reluctant to upload it until I've confirmed it doesn't contain anything sensitive. If it's really needed, let me know.

Thanks in advance!

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

jerrac avatar Jan 04 '22 18:01 jerrac

Does it fail consistently?

christarazi avatar Jan 05 '22 07:01 christarazi

Does it fail consistently?

Yes. I get the same output every time I run cilium connectivity test.

jerrac avatar Jan 05 '22 16:01 jerrac

Hm, that sounds like a cilium-cli issue to me rather than a Cilium issue, given that you can run the commands manually and they succeed.

@tklauser Should we transfer this issue to the cilium-cli repo?

christarazi avatar Jan 05 '22 19:01 christarazi

~@christarazi I wonder what makes you think that the issue must be a cilium-cli issue?~

EDIT: I follow the logic now, just had to let it settle in my head :sweat_smile:

One other thing that can help to identify what went wrong is to follow these instructions from the CLI:

⚠️  Unable to contact Hubble Relay, disabling Hubble telescope and flow validation: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:4245: connect: connection refused"
ℹ️  Expose Relay locally with:
   cilium hubble enable
   cilium hubble port-forward&

This way, Cilium and Hubble can provide more details about exactly what happens to packets when you run the connectivity test. Overall the structure of the output will look very similar, but there will be more details that we can look at.
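
Once Relay is reachable, you can also watch flows for the test namespace with the hubble CLI while the test runs (a sketch; flag names assume a reasonably recent hubble CLI):

cilium hubble port-forward &
# follow flows for the connectivity-test namespace while the test runs
hubble observe --namespace cilium-test --follow
# or only show dropped traffic
hubble observe --namespace cilium-test --verdict DROPPED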

joestringer avatar Jan 06 '22 00:01 joestringer

@tklauser Should we transfer this issue to the cilium-cli repo?

👍 let's transfer it there.

tklauser avatar Jan 06 '22 13:01 tklauser

Thanks for the help so far. :)

After installing Hubble, 11/11 tests fail...

Here is my output:

Installed Hubble:

root@controlplane:~# cilium hubble enable
🔑 Found CA in secret cilium-ca
✨ Patching ConfigMap cilium-config to enable Hubble...
♻️  Restarted Cilium pods
⌛ Waiting for Cilium to become ready before deploying other Hubble component(s)...
🔑 Generating certificates for Relay...
✨ Deploying Relay from quay.io/cilium/hubble-relay:v1.11.0...
⌛ Waiting for Hubble to be installed...
✅ Hubble was successfully enabled!

root@controlplane:~# cilium hubble port-forward&
[1] 410893

Ran connectivity test again:

root@controlplane:~# cilium connectivity test
ℹ️  Monitor aggregation detected, will skip some flow validation steps
⌛ [clustername] Waiting for deployments [client client2 echo-same-node] to become ready...
⌛ [clustername] Waiting for deployments [echo-other-node] to become ready...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/client-7568bc7f86-5c28z to appear...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/client2-686d5f784b-nb6j6 to appear...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/echo-other-node-59d779959c-2jggr to appear...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/echo-same-node-5767b7b99d-dmdj4 to appear...
⌛ [clustername] Waiting for Service cilium-test/echo-same-node to become ready...
⌛ [clustername] Waiting for Service cilium-test/echo-other-node to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.8:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.8:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.7:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.7:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.207:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.207:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.9:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.9:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.208:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.208:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.209:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.209:32187 (cilium-test/echo-same-node) to become ready...
ℹ️  Skipping IPCache check
⌛ [clustername] Waiting for pod cilium-test/client-7568bc7f86-5c28z to reach default/kubernetes service...
⌛ [clustername] Waiting for pod cilium-test/client2-686d5f784b-nb6j6 to reach default/kubernetes service...
🔭 Enabling Hubble telescope...
ℹ️  Hubble is OK, flows: 0/0
🏃 Running tests...

[=] Test [no-policies]
.
  [-] Scenario [no-policies/pod-to-cidr]
  [.] Action [no-policies/pod-to-cidr/cloudflare-1001-0: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.212) -> cloudflare-1001 (1.0.0.1:80)]
  🔥 Timeout waiting for flow listener to become ready

[=] Test [allow-all]
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'allow-all' to namespace 'cilium-test'..
  [-] Scenario [allow-all/client-to-client]
  [.] Action [allow-all/client-to-client/ping-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.212:0)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'allow-all' from namespace 'cilium-test'..

[=] Test [client-ingress]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-ingress-from-client2' to namespace 'cilium-test'..
  [-] Scenario [client-ingress/client-to-client]
  [.] Action [client-ingress/client-to-client/ping-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.212:0)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-ingress-from-client2' from namespace 'cilium-test'..

[=] Test [echo-ingress]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'echo-ingress-from-other-client' to namespace 'cilium-test'..
  [-] Scenario [echo-ingress/pod-to-pod]
  [.] Action [echo-ingress/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'echo-ingress-from-other-client' from namespace 'cilium-test'..

[=] Test [client-egress]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-to-echo' to namespace 'cilium-test'..
  [-] Scenario [client-egress/pod-to-pod]
  [.] Action [client-egress/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-to-echo' from namespace 'cilium-test'..

[=] Test [to-entities-world]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-to-entities-world' to namespace 'cilium-test'..
  [-] Scenario [to-entities-world/pod-to-world]
  [.] Action [to-entities-world/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> one-one-one-one-http (one.one.one.one:80)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-to-entities-world' from namespace 'cilium-test'..

[=] Test [to-cidr-1111]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-to-cidr' to namespace 'cilium-test'..
  [-] Scenario [to-cidr-1111/pod-to-cidr]
  [.] Action [to-cidr-1111/pod-to-cidr/cloudflare-1001-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cloudflare-1001 (1.0.0.1:80)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-to-cidr' from namespace 'cilium-test'..

[=] Test [echo-ingress-l7]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'echo-ingress-l7-http' to namespace 'cilium-test'..
  [-] Scenario [echo-ingress-l7/pod-to-pod]
  [.] Action [echo-ingress-l7/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'echo-ingress-l7-http' from namespace 'cilium-test'..

[=] Test [client-egress-l7]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-only-dns' to namespace 'cilium-test'..
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-l7-http' to namespace 'cilium-test'..
  [-] Scenario [client-egress-l7/pod-to-pod]
  [.] Action [client-egress-l7/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-l7-http' from namespace 'cilium-test'..
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-only-dns' from namespace 'cilium-test'..

[=] Test [dns-only]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-only-dns' to namespace 'cilium-test'..
  [-] Scenario [dns-only/pod-to-pod]
  [.] Action [dns-only/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-only-dns' from namespace 'cilium-test'..

[=] Test [to-fqdns]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-to-fqdns-one-one-one-one' to namespace 'cilium-test'..
  [-] Scenario [to-fqdns/pod-to-world]
  [.] Action [to-fqdns/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> one-one-one-one-http (one.one.one.one:80)]
  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-to-fqdns-one-one-one-one' from namespace 'cilium-test'..

📋 Test Report
❌ 11/11 tests failed (11/11 actions), 0 tests skipped, 0 scenarios skipped:
Test [no-policies]:
  ❌ no-policies/pod-to-cidr/cloudflare-1001-0: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.212) -> cloudflare-1001 (1.0.0.1:80)
Test [allow-all]:
  ❌ allow-all/client-to-client/ping-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.212:0)
Test [client-ingress]:
  ❌ client-ingress/client-to-client/ping-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.212:0)
Test [echo-ingress]:
  ❌ echo-ingress/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)
Test [client-egress]:
  ❌ client-egress/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)
Test [to-entities-world]:
  ❌ to-entities-world/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> one-one-one-one-http (one.one.one.one:80)
Test [to-cidr-1111]:
  ❌ to-cidr-1111/pod-to-cidr/cloudflare-1001-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cloudflare-1001 (1.0.0.1:80)
Test [echo-ingress-l7]:
  ❌ echo-ingress-l7/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)
Test [client-egress-l7]:
  ❌ client-egress-l7/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)
Test [dns-only]:
  ❌ dns-only/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> cilium-test/echo-other-node-59d779959c-2jggr (nnn.nnn.4.142:8080)
Test [to-fqdns]:
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.168) -> one-one-one-one-http (one.one.one.one:80)
Connectivity test failed: 11 tests failed

If I run a curl container:

$ kubectl run -it --rm test --image=curlimages/curl --restart=Never -- /bin/sh
If you don't see a command prompt, try pressing enter.
/ $ curl 1.0.0.1:80
<html>
<head><title>301 Moved Permanently</title></head>
<body>
<center><h1>301 Moved Permanently</h1></center>
<hr><center>cloudflare</center>
</body>
</html>
/ $ 

jerrac avatar Jan 06 '22 16:01 jerrac

Oh, one other thought just occurred to me. I'm running firewalld to manage iptables. I opened 8472/udp (from Cilium's docs, if I recall correctly), as well as the ports listed here: https://kubernetes.io/docs/reference/ports-and-protocols/

Beyond that, doesn't Cilium just bypass iptables? Or did I miss something in the docs that I need to open?

jerrac avatar Jan 06 '22 16:01 jerrac

It bypasses iptables fully if it runs with kube-proxy-replacement=strict. See this doc for more details on that.
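
To check which mode actually ended up active, you can ask the agent itself (a sketch, assuming the agent runs as the cilium DaemonSet in kube-system):

# print the kube-proxy replacement mode reported by the running agent
kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement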

christarazi avatar Jan 06 '22 17:01 christarazi

It looks like the test was unable to connect to Hubble. Have you opened up the Hubble ports in your firewall as well? https://docs.cilium.io/en/stable/operations/system_requirements/#firewall-rules
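
With firewalld, opening the Cilium/Hubble ports from that page would look roughly like this (a sketch; it assumes the default zone, a VXLAN tunnel, and the port numbers listed on that page):

# health checks, Hubble server, Hubble Relay, VXLAN overlay
firewall-cmd --permanent --add-port=4240/tcp
firewall-cmd --permanent --add-port=4244/tcp
firewall-cmd --permanent --add-port=4245/tcp
firewall-cmd --permanent --add-port=8472/udp
firewall-cmd --reload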

joestringer avatar Jan 06 '22 17:01 joestringer

RE: https://github.com/cilium/cilium-cli/issues/673#issuecomment-1006760108

I installed Kubernetes via kubeadm, then installed Cilium with cilium install --config ipam=kubernetes. From the doc you linked, it sounds like the default is kubeProxyReplacement=probe. Right?

And I do have kube-proxy pods running in my cluster.

So could firewalld be getting in the way?

jerrac avatar Jan 06 '22 17:01 jerrac

In your original report, the failing tests were the following:

❌ 2/11 tests failed (3/142 actions), 0 tests skipped, 0 scenarios skipped:
Test [client-egress-l7]:
  ❌ client-egress-l7/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-http (one.one.one.one:80)
Test [to-fqdns]:
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-0: cilium-test/client2-686d5f784b-5llc9 (nnn.nnn.3.88) -> one-one-one-one-http (one.one.one.one:80)
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-1: cilium-test/client-7568bc7f86-2mdgt (nnn.nnn.3.67) -> one-one-one-one-http (one.one.one.one:80)
Connectivity test failed: 2 tests failed

These tests deploy pods with an L7 policy applied and then attempt outbound connections. If you deploy a brand-new container with different labels, the L7 policy may not apply to that pod, which could explain the difference between your manual attempts and the cilium connectivity test.
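
One way to make the comparison apples-to-apples (a sketch; deployment names taken from your test output) is to repeat the request from one of the existing test clients while the test's CiliumNetworkPolicy is still applied:

# with the client-egress-l7 policies applied, repeat the failing request from the real test client
kubectl -n cilium-test exec deploy/client2 -- \
  curl --silent --fail --show-error --connect-timeout 5 http://one.one.one.one:80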

In the latest run output you posted, all tests failed, but they all have this kind of error as well:

  🔥 Timeout waiting for flow listener to become ready
  🔥 Receiving flows from Hubble Relay: hubble server status failure: context canceled

I suspect this means the actual test was not performed, because the CLI could not establish a connection to Hubble to monitor the output. This is likely caused by the firewall. Your manual command likely works because your pod is not running in the same namespace, is not affected by the same policies, and does not rely on the ability to connect to Hubble.

Overall I'd say the simplest explanation for why the connectivity check isn't working is that firewalld is getting in the way, but the main way to confirm that would be to check whether firewalld is dropping packets that are required for this connectivity test.
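
One way to check that (a sketch) is to turn on firewalld's denied-packet logging and watch the kernel log while the test runs:

# log everything firewalld rejects or drops
firewall-cmd --set-log-denied=all
# then, while the connectivity test runs, watch for hits
journalctl -k -f | grep -iE 'reject|drop'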

The motivation behind enabling Hubble and then re-running the test is that Cilium and Hubble can independently tell you what they see when handling the traffic, which could point out whether Cilium explicitly drops any traffic, or whether it hands traffic to the Linux stack and the Linux stack then fails to deliver it (for instance, because the firewall drops it).

joestringer avatar Jan 06 '22 17:01 joestringer

Ok, I have no idea how I missed that list of ports to open. I thought I had looked for a list like that... Sheesh. I've opened the ports in firewalld and done a full cluster restart.

Anyway, something is still not working right. It can't connect to the Hubble Relay. I confirmed with nmap that port 4245 is open, even though nothing seems to be listening on it. Do I need to reinstall Hubble? How would I go about that?
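
For reference, here's what I'm checking on my side (a sketch; the k8s-app=hubble-relay label is my assumption about what cilium hubble enable deploys):

# is the relay actually running?
kubectl -n kube-system get pods -l k8s-app=hubble-relay
# does the CLI consider Hubble healthy?
cilium status
# re-establish the local forward before re-running the test
cilium hubble port-forward &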

Here is my test output, I do get the same output when I run it multiple times:

root@controlplane:~# cilium connectivity test
ℹ️  Monitor aggregation detected, will skip some flow validation steps
⌛ [clustername] Waiting for deployments [client client2 echo-same-node] to become ready...
⌛ [clustername] Waiting for deployments [echo-other-node] to become ready...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/client-7568bc7f86-5c28z to appear...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/client2-686d5f784b-nb6j6 to appear...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/echo-other-node-59d779959c-wddcv to appear...
⌛ [clustername] Waiting for CiliumEndpoint for pod cilium-test/echo-same-node-5767b7b99d-dmdj4 to appear...
⌛ [clustername] Waiting for Service cilium-test/echo-other-node to become ready...
⌛ [clustername] Waiting for Service cilium-test/echo-same-node to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.207:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.207:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.9:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.9:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.208:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.208:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.209:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.209:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.8:31231 (cilium-test/echo-other-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.8:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.7:32187 (cilium-test/echo-same-node) to become ready...
⌛ [clustername] Waiting for NodePort nnn.nnn.nnn.7:31231 (cilium-test/echo-other-node) to become ready...
ℹ️  Skipping IPCache check
⌛ [clustername] Waiting for pod cilium-test/client-7568bc7f86-5c28z to reach default/kubernetes service...
⌛ [clustername] Waiting for pod cilium-test/client2-686d5f784b-nb6j6 to reach default/kubernetes service...
🔭 Enabling Hubble telescope...
⚠️  Unable to contact Hubble Relay, disabling Hubble telescope and flow validation: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:4245: connect: connection refused"
ℹ️  Expose Relay locally with:
   cilium hubble enable
   cilium hubble port-forward&
🏃 Running tests...

[=] Test [no-policies]
................................................
[=] Test [allow-all]
............................................
[=] Test [client-ingress]
..
[=] Test [echo-ingress]
....
[=] Test [client-egress]
....
[=] Test [to-entities-world]
......
[=] Test [to-cidr-1111]
....
[=] Test [echo-ingress-l7]
....
[=] Test [client-egress-l7]
........
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-only-dns' to namespace 'cilium-test'..
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-l7-http' to namespace 'cilium-test'..
  [-] Scenario [client-egress-l7/pod-to-pod]
  [.] Action [client-egress-l7/pod-to-pod/curl-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.57) -> cilium-test/echo-other-node-59d779959c-wddcv (nnn.nnn.3.234:8080)]
  [.] Action [client-egress-l7/pod-to-pod/curl-1: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.57) -> cilium-test/echo-same-node-5767b7b99d-dmdj4 (nnn.nnn.5.212:8080)]
  [.] Action [client-egress-l7/pod-to-pod/curl-2: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> cilium-test/echo-same-node-5767b7b99d-dmdj4 (nnn.nnn.5.212:8080)]
  [.] Action [client-egress-l7/pod-to-pod/curl-3: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> cilium-test/echo-other-node-59d779959c-wddcv (nnn.nnn.3.234:8080)]
  [-] Scenario [client-egress-l7/pod-to-world]
  [.] Action [client-egress-l7/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.57) -> one-one-one-one-http (one.one.one.one:80)]
  [.] Action [client-egress-l7/pod-to-world/https-to-one-one-one-one-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.57) -> one-one-one-one-https (one.one.one.one:443)]
  [.] Action [client-egress-l7/pod-to-world/https-to-one-one-one-one-index-0: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.57) -> one-one-one-one-https-index (one.one.one.one:443)]
  [.] Action [client-egress-l7/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> one-one-one-one-http (one.one.one.one:80)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null http://one.one.one.one:80" failed: command terminated with exit code 28
  ℹ️  curl output:
  curl: (28) Resolving timed out after 5001 milliseconds
:0 -> :0 = 000
  
  📄 No flows recorded during action http-to-one-one-one-one-1
  📄 No flows recorded during action http-to-one-one-one-one-1
  [.] Action [client-egress-l7/pod-to-world/https-to-one-one-one-one-1: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> one-one-one-one-https (one.one.one.one:443)]
  [.] Action [client-egress-l7/pod-to-world/https-to-one-one-one-one-index-1: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> one-one-one-one-https-index (one.one.one.one:443)]
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-only-dns' from namespace 'cilium-test'..
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-l7-http' from namespace 'cilium-test'..

[=] Test [dns-only]
..........
[=] Test [to-fqdns]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-to-fqdns-one-one-one-one' to namespace 'cilium-test'..
  [-] Scenario [to-fqdns/pod-to-world]
  [.] Action [to-fqdns/pod-to-world/http-to-one-one-one-one-0: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> one-one-one-one-http (one.one.one.one:80)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null http://one.one.one.one:80" failed: command terminated with exit code 28
  ℹ️  curl output:
  curl: (28) Resolving timed out after 5000 milliseconds
:0 -> :0 = 000
  
  📄 No flows recorded during action http-to-one-one-one-one-0
  📄 No flows recorded during action http-to-one-one-one-one-0
  [.] Action [to-fqdns/pod-to-world/https-to-one-one-one-one-0: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> one-one-one-one-https (one.one.one.one:443)]
  [.] Action [to-fqdns/pod-to-world/https-to-one-one-one-one-index-0: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> one-one-one-one-https-index (one.one.one.one:443)]
  [.] Action [to-fqdns/pod-to-world/http-to-one-one-one-one-1: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.57) -> one-one-one-one-http (one.one.one.one:80)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null http://one.one.one.one:80" failed: command terminated with exit code 28
  ℹ️  curl output:
  curl: (28) Resolving timed out after 5000 milliseconds
:0 -> :0 = 000
  
  📄 No flows recorded during action http-to-one-one-one-one-1
  📄 No flows recorded during action http-to-one-one-one-one-1
  [.] Action [to-fqdns/pod-to-world/https-to-one-one-one-one-1: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.57) -> one-one-one-one-https (one.one.one.one:443)]
  [.] Action [to-fqdns/pod-to-world/https-to-one-one-one-one-index-1: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.57) -> one-one-one-one-https-index (one.one.one.one:443)]
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-to-fqdns-one-one-one-one' from namespace 'cilium-test'..

📋 Test Report
❌ 2/11 tests failed (3/142 actions), 0 tests skipped, 0 scenarios skipped:
Test [client-egress-l7]:
  ❌ client-egress-l7/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> one-one-one-one-http (one.one.one.one:80)
Test [to-fqdns]:
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-0: cilium-test/client2-686d5f784b-nb6j6 (nnn.nnn.5.36) -> one-one-one-one-http (one.one.one.one:80)
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-1: cilium-test/client-7568bc7f86-5c28z (nnn.nnn.5.57) -> one-one-one-one-http (one.one.one.one:80)
Connectivity test failed: 2 tests failed

jerrac avatar Jan 06 '22 18:01 jerrac

Ok, so I tried following these docs to install a new cluster without kube-proxy, using KubeProxyReplacement: Strict mode. I put this cluster on a couple of new Ubuntu 20.04 VMs hosted on my laptop. (Specifically, using Vagrant boxes from generic/ubuntu2004.)

I also found the note in the requirements about Systemd 245 and needing to override rp_filter. So I applied the config as that note suggested.
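
For reference, the override I applied looks roughly like this (a sketch of how I read that note; the exact file name is mine):

# keep systemd from turning rp_filter back on for Cilium's lxc* interfaces
echo 'net.ipv4.conf.lxc*.rp_filter = 0' > /etc/sysctl.d/99-override_cilium_rp_filter.conf
systemctl restart systemd-sysctl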

End result is the same curl error on the tests.

That said, if I disable firewalld and restart Docker and kubelet, the tests all succeed.

So I'm definitely missing something in my firewall rules. As far as I can see, I've opened all the ports listed in the system requirements docs for both Kubernetes and Cilium.
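
For completeness, this is how I'm checking what firewalld itself thinks is open (a sketch):

firewall-cmd --get-active-zones
firewall-cmd --list-all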

I'll share my iptables below. Is there anything missing?

Here are the results of iptables -n --list for my control plane:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
CILIUM_INPUT  all  --  0.0.0.0/0            0.0.0.0/0            /* cilium-feeder: CILIUM_INPUT */
KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED,DNAT
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
INPUT_direct  all  --  0.0.0.0/0            0.0.0.0/0           
INPUT_ZONES  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0            ctstate INVALID
REJECT     all  --  0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
CILIUM_FORWARD  all  --  0.0.0.0/0            0.0.0.0/0            /* cilium-feeder: CILIUM_FORWARD */
DOCKER-USER  all  --  0.0.0.0/0            0.0.0.0/0           
DOCKER-ISOLATION-STAGE-1  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED,DNAT
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
FORWARD_direct  all  --  0.0.0.0/0            0.0.0.0/0           
FORWARD_IN_ZONES  all  --  0.0.0.0/0            0.0.0.0/0           
FORWARD_OUT_ZONES  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0            ctstate INVALID
REJECT     all  --  0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
CILIUM_OUTPUT  all  --  0.0.0.0/0            0.0.0.0/0            /* cilium-feeder: CILIUM_OUTPUT */
KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
OUTPUT_direct  all  --  0.0.0.0/0            0.0.0.0/0           

Chain CILIUM_FORWARD (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* cilium: any->cluster on cilium_host forward accept */
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* cilium: cluster->any on cilium_host forward accept (nodeport) */
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* cilium: cluster->any on lxc+ forward accept */
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* cilium: cluster->any on cilium_net forward accept (nodeport) */

Chain CILIUM_INPUT (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            mark match 0x200/0xf00 /* cilium: ACCEPT for proxy traffic */

Chain CILIUM_OUTPUT (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            mark match 0xa00/0xfffffeff /* cilium: ACCEPT for proxy return traffic */
MARK       all  --  0.0.0.0/0            0.0.0.0/0            mark match ! 0xe00/0xf00 mark match ! 0xd00/0xf00 mark match ! 0xa00/0xe00 /* cilium: host->any mark as from host */ MARK xset 0xc00/0xf00

Chain DOCKER (1 references)
target     prot opt source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target     prot opt source               destination         
DOCKER-ISOLATION-STAGE-2  all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target     prot opt source               destination         
DROP       all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FORWARD_IN_ZONES (1 references)
target     prot opt source               destination         
FWDI_docker  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 

Chain FORWARD_OUT_ZONES (1 references)
target     prot opt source               destination         
FWDO_docker  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 

Chain FORWARD_direct (1 references)
target     prot opt source               destination         

Chain FWDI_docker (1 references)
target     prot opt source               destination         
FWDI_docker_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_docker_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_docker_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_docker_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_docker_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDI_docker_allow (1 references)
target     prot opt source               destination         

Chain FWDI_docker_deny (1 references)
target     prot opt source               destination         

Chain FWDI_docker_log (1 references)
target     prot opt source               destination         

Chain FWDI_docker_post (1 references)
target     prot opt source               destination         

Chain FWDI_docker_pre (1 references)
target     prot opt source               destination         

Chain FWDI_public (3 references)
target     prot opt source               destination         
FWDI_public_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_public_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_public_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_public_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_public_post  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDI_public_allow (1 references)
target     prot opt source               destination         
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0            icmptype 0
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0            icmptype 8

Chain FWDI_public_deny (1 references)
target     prot opt source               destination         

Chain FWDI_public_log (1 references)
target     prot opt source               destination         

Chain FWDI_public_post (1 references)
target     prot opt source               destination         

Chain FWDI_public_pre (1 references)
target     prot opt source               destination         

Chain FWDI_trusted (2 references)
target     prot opt source               destination         
FWDI_trusted_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_trusted_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_trusted_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_trusted_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_trusted_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDI_trusted_allow (1 references)
target     prot opt source               destination         

Chain FWDI_trusted_deny (1 references)
target     prot opt source               destination         

Chain FWDI_trusted_log (1 references)
target     prot opt source               destination         

Chain FWDI_trusted_post (1 references)
target     prot opt source               destination         

Chain FWDI_trusted_pre (1 references)
target     prot opt source               destination         

Chain FWDO_docker (1 references)
target     prot opt source               destination         
FWDO_docker_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_docker_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_docker_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_docker_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_docker_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDO_docker_allow (1 references)
target     prot opt source               destination         

Chain FWDO_docker_deny (1 references)
target     prot opt source               destination         

Chain FWDO_docker_log (1 references)
target     prot opt source               destination         

Chain FWDO_docker_post (1 references)
target     prot opt source               destination         

Chain FWDO_docker_pre (1 references)
target     prot opt source               destination         

Chain FWDO_public (3 references)
target     prot opt source               destination         
FWDO_public_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_public_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_public_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_public_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_public_post  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDO_public_allow (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate NEW,UNTRACKED

Chain FWDO_public_deny (1 references)
target     prot opt source               destination         

Chain FWDO_public_log (1 references)
target     prot opt source               destination         

Chain FWDO_public_post (1 references)
target     prot opt source               destination         

Chain FWDO_public_pre (1 references)
target     prot opt source               destination         

Chain FWDO_trusted (2 references)
target     prot opt source               destination         
FWDO_trusted_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_trusted_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_trusted_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_trusted_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_trusted_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDO_trusted_allow (1 references)
target     prot opt source               destination         

Chain FWDO_trusted_deny (1 references)
target     prot opt source               destination         

Chain FWDO_trusted_log (1 references)
target     prot opt source               destination         

Chain FWDO_trusted_post (1 references)
target     prot opt source               destination         

Chain FWDO_trusted_pre (1 references)
target     prot opt source               destination         

Chain INPUT_ZONES (1 references)
target     prot opt source               destination         
IN_docker  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 

Chain INPUT_direct (1 references)
target     prot opt source               destination         

Chain IN_docker (1 references)
target     prot opt source               destination         
IN_docker_pre  all  --  0.0.0.0/0            0.0.0.0/0           
IN_docker_log  all  --  0.0.0.0/0            0.0.0.0/0           
IN_docker_deny  all  --  0.0.0.0/0            0.0.0.0/0           
IN_docker_allow  all  --  0.0.0.0/0            0.0.0.0/0           
IN_docker_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain IN_docker_allow (1 references)
target     prot opt source               destination         

Chain IN_docker_deny (1 references)
target     prot opt source               destination         

Chain IN_docker_log (1 references)
target     prot opt source               destination         

Chain IN_docker_post (1 references)
target     prot opt source               destination         

Chain IN_docker_pre (1 references)
target     prot opt source               destination         

Chain IN_public (3 references)
target     prot opt source               destination         
IN_public_pre  all  --  0.0.0.0/0            0.0.0.0/0           
IN_public_log  all  --  0.0.0.0/0            0.0.0.0/0           
IN_public_deny  all  --  0.0.0.0/0            0.0.0.0/0           
IN_public_allow  all  --  0.0.0.0/0            0.0.0.0/0           
IN_public_post  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0           

Chain IN_public_allow (1 references)
target     prot opt source               destination         
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0            icmptype 0
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0            icmptype 8
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:80 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:443 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:22 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:4149 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpts:2379:2380 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:4240 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpts:6060:6062 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpts:9890:9893 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:9876 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:9090 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpts:4244:4245 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:6942 ctstate NEW,UNTRACKED
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:8472 ctstate NEW,UNTRACKED
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:51871 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8443 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:6443 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:7443 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:10257 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:10259 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:10000 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:10250 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:7946 ctstate NEW,UNTRACKED
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:7946 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:2812 ctstate NEW,UNTRACKED

Chain IN_public_deny (1 references)
target     prot opt source               destination         

Chain IN_public_log (1 references)
target     prot opt source               destination         

Chain IN_public_post (1 references)
target     prot opt source               destination         

Chain IN_public_pre (1 references)
target     prot opt source               destination         

Chain IN_trusted (2 references)
target     prot opt source               destination         
IN_trusted_pre  all  --  0.0.0.0/0            0.0.0.0/0           
IN_trusted_log  all  --  0.0.0.0/0            0.0.0.0/0           
IN_trusted_deny  all  --  0.0.0.0/0            0.0.0.0/0           
IN_trusted_allow  all  --  0.0.0.0/0            0.0.0.0/0           
IN_trusted_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain IN_trusted_allow (1 references)
target     prot opt source               destination         

Chain IN_trusted_deny (1 references)
target     prot opt source               destination         

Chain IN_trusted_log (1 references)
target     prot opt source               destination         

Chain IN_trusted_post (1 references)
target     prot opt source               destination         

Chain IN_trusted_pre (1 references)
target     prot opt source               destination         

Chain KUBE-FIREWALL (2 references)
target     prot opt source               destination         
DROP       all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
DROP       all  -- !127.0.0.0/8          127.0.0.0/8          /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination         

Chain OUTPUT_direct (1 references)
target     prot opt source               destination         

Here are the results of iptables -n --list for my worker:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
CILIUM_INPUT  all  --  0.0.0.0/0            0.0.0.0/0            /* cilium-feeder: CILIUM_INPUT */
KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED,DNAT
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
INPUT_direct  all  --  0.0.0.0/0            0.0.0.0/0           
INPUT_ZONES  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0            ctstate INVALID
REJECT     all  --  0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
CILIUM_FORWARD  all  --  0.0.0.0/0            0.0.0.0/0            /* cilium-feeder: CILIUM_FORWARD */
DOCKER-USER  all  --  0.0.0.0/0            0.0.0.0/0           
DOCKER-ISOLATION-STAGE-1  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED,DNAT
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
FORWARD_direct  all  --  0.0.0.0/0            0.0.0.0/0           
FORWARD_IN_ZONES  all  --  0.0.0.0/0            0.0.0.0/0           
FORWARD_OUT_ZONES  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0            ctstate INVALID
REJECT     all  --  0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
CILIUM_OUTPUT  all  --  0.0.0.0/0            0.0.0.0/0            /* cilium-feeder: CILIUM_OUTPUT */
KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
OUTPUT_direct  all  --  0.0.0.0/0            0.0.0.0/0           

Chain CILIUM_FORWARD (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* cilium: any->cluster on cilium_host forward accept */
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* cilium: cluster->any on cilium_host forward accept (nodeport) */
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* cilium: cluster->any on lxc+ forward accept */
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* cilium: cluster->any on cilium_net forward accept (nodeport) */

Chain CILIUM_INPUT (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            mark match 0x200/0xf00 /* cilium: ACCEPT for proxy traffic */

Chain CILIUM_OUTPUT (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            mark match 0xa00/0xfffffeff /* cilium: ACCEPT for proxy return traffic */
MARK       all  --  0.0.0.0/0            0.0.0.0/0            mark match ! 0xe00/0xf00 mark match ! 0xd00/0xf00 mark match ! 0xa00/0xe00 /* cilium: host->any mark as from host */ MARK xset 0xc00/0xf00

Chain DOCKER (1 references)
target     prot opt source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target     prot opt source               destination         
DOCKER-ISOLATION-STAGE-2  all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target     prot opt source               destination         
DROP       all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FORWARD_IN_ZONES (1 references)
target     prot opt source               destination         
FWDI_docker  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDI_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 

Chain FORWARD_OUT_ZONES (1 references)
target     prot opt source               destination         
FWDO_docker  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
FWDO_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 

Chain FORWARD_direct (1 references)
target     prot opt source               destination         

Chain FWDI_docker (1 references)
target     prot opt source               destination         
FWDI_docker_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_docker_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_docker_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_docker_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_docker_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDI_docker_allow (1 references)
target     prot opt source               destination         

Chain FWDI_docker_deny (1 references)
target     prot opt source               destination         

Chain FWDI_docker_log (1 references)
target     prot opt source               destination         

Chain FWDI_docker_post (1 references)
target     prot opt source               destination         

Chain FWDI_docker_pre (1 references)
target     prot opt source               destination         

Chain FWDI_public (3 references)
target     prot opt source               destination         
FWDI_public_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_public_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_public_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_public_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_public_post  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDI_public_allow (1 references)
target     prot opt source               destination         
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0            icmptype 0
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0            icmptype 8

Chain FWDI_public_deny (1 references)
target     prot opt source               destination         

Chain FWDI_public_log (1 references)
target     prot opt source               destination         

Chain FWDI_public_post (1 references)
target     prot opt source               destination         

Chain FWDI_public_pre (1 references)
target     prot opt source               destination         

Chain FWDI_trusted (2 references)
target     prot opt source               destination         
FWDI_trusted_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_trusted_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_trusted_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_trusted_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDI_trusted_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDI_trusted_allow (1 references)
target     prot opt source               destination         

Chain FWDI_trusted_deny (1 references)
target     prot opt source               destination         

Chain FWDI_trusted_log (1 references)
target     prot opt source               destination         

Chain FWDI_trusted_post (1 references)
target     prot opt source               destination         

Chain FWDI_trusted_pre (1 references)
target     prot opt source               destination         

Chain FWDO_docker (1 references)
target     prot opt source               destination         
FWDO_docker_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_docker_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_docker_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_docker_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_docker_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDO_docker_allow (1 references)
target     prot opt source               destination         

Chain FWDO_docker_deny (1 references)
target     prot opt source               destination         

Chain FWDO_docker_log (1 references)
target     prot opt source               destination         

Chain FWDO_docker_post (1 references)
target     prot opt source               destination         

Chain FWDO_docker_pre (1 references)
target     prot opt source               destination         

Chain FWDO_public (3 references)
target     prot opt source               destination         
FWDO_public_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_public_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_public_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_public_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_public_post  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDO_public_allow (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate NEW,UNTRACKED

Chain FWDO_public_deny (1 references)
target     prot opt source               destination         

Chain FWDO_public_log (1 references)
target     prot opt source               destination         

Chain FWDO_public_post (1 references)
target     prot opt source               destination         

Chain FWDO_public_pre (1 references)
target     prot opt source               destination         

Chain FWDO_trusted (2 references)
target     prot opt source               destination         
FWDO_trusted_pre  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_trusted_log  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_trusted_deny  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_trusted_allow  all  --  0.0.0.0/0            0.0.0.0/0           
FWDO_trusted_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain FWDO_trusted_allow (1 references)
target     prot opt source               destination         

Chain FWDO_trusted_deny (1 references)
target     prot opt source               destination         

Chain FWDO_trusted_log (1 references)
target     prot opt source               destination         

Chain FWDO_trusted_post (1 references)
target     prot opt source               destination         

Chain FWDO_trusted_pre (1 references)
target     prot opt source               destination         

Chain INPUT_ZONES (1 references)
target     prot opt source               destination         
IN_docker  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_trusted  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 
IN_public  all  --  0.0.0.0/0            0.0.0.0/0           [goto] 

Chain INPUT_direct (1 references)
target     prot opt source               destination         

Chain IN_docker (1 references)
target     prot opt source               destination         
IN_docker_pre  all  --  0.0.0.0/0            0.0.0.0/0           
IN_docker_log  all  --  0.0.0.0/0            0.0.0.0/0           
IN_docker_deny  all  --  0.0.0.0/0            0.0.0.0/0           
IN_docker_allow  all  --  0.0.0.0/0            0.0.0.0/0           
IN_docker_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain IN_docker_allow (1 references)
target     prot opt source               destination         

Chain IN_docker_deny (1 references)
target     prot opt source               destination         

Chain IN_docker_log (1 references)
target     prot opt source               destination         

Chain IN_docker_post (1 references)
target     prot opt source               destination         

Chain IN_docker_pre (1 references)
target     prot opt source               destination         

Chain IN_public (3 references)
target     prot opt source               destination         
IN_public_pre  all  --  0.0.0.0/0            0.0.0.0/0           
IN_public_log  all  --  0.0.0.0/0            0.0.0.0/0           
IN_public_deny  all  --  0.0.0.0/0            0.0.0.0/0           
IN_public_allow  all  --  0.0.0.0/0            0.0.0.0/0           
IN_public_post  all  --  0.0.0.0/0            0.0.0.0/0           
DROP       all  --  0.0.0.0/0            0.0.0.0/0           

Chain IN_public_allow (1 references)
target     prot opt source               destination         
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0            icmptype 0
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0            icmptype 8
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:80 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:443 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:22 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:4149 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpts:2379:2380 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:4240 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpts:6060:6062 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpts:9890:9893 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:9876 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:9090 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpts:4244:4245 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:6942 ctstate NEW,UNTRACKED
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:8472 ctstate NEW,UNTRACKED
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:51871 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:24007 ctstate NEW,UNTRACKED
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:24007 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:24008 ctstate NEW,UNTRACKED
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:24008 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:49152 ctstate NEW,UNTRACKED
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:49152 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8443 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:6443 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:7443 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:10250 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:7946 ctstate NEW,UNTRACKED
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:7946 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:2812 ctstate NEW,UNTRACKED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpts:30000:32767 ctstate NEW,UNTRACKED

Chain IN_public_deny (1 references)
target     prot opt source               destination         

Chain IN_public_log (1 references)
target     prot opt source               destination         

Chain IN_public_post (1 references)
target     prot opt source               destination         

Chain IN_public_pre (1 references)
target     prot opt source               destination         

Chain IN_trusted (2 references)
target     prot opt source               destination         
IN_trusted_pre  all  --  0.0.0.0/0            0.0.0.0/0           
IN_trusted_log  all  --  0.0.0.0/0            0.0.0.0/0           
IN_trusted_deny  all  --  0.0.0.0/0            0.0.0.0/0           
IN_trusted_allow  all  --  0.0.0.0/0            0.0.0.0/0           
IN_trusted_post  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain IN_trusted_allow (1 references)
target     prot opt source               destination         

Chain IN_trusted_deny (1 references)
target     prot opt source               destination         

Chain IN_trusted_log (1 references)
target     prot opt source               destination         

Chain IN_trusted_post (1 references)
target     prot opt source               destination         

Chain IN_trusted_pre (1 references)
target     prot opt source               destination         

Chain KUBE-FIREWALL (2 references)
target     prot opt source               destination         
DROP       all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
DROP       all  -- !127.0.0.0/8          127.0.0.0/8          /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination         

Chain OUTPUT_direct (1 references)
target     prot opt source               destination         

jerrac avatar Jan 11 '22 16:01 jerrac

Are you able to narrow down the list of rules by inspecting the drop counters via iptables-save -c ...?

christarazi avatar Jan 11 '22 22:01 christarazi

Does this help?

# iptables-save -c | grep -i drop 
:KUBE-MARK-DROP - [0:0]
[0:0] -A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
[0:0] -A INPUT -m conntrack --ctstate INVALID -j DROP
[0:0] -A FORWARD -m conntrack --ctstate INVALID -j DROP
[0:0] -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
[0:0] -A FWDI_public -j DROP
[0:0] -A FWDO_public -j DROP
[1407:402029] -A IN_public -j DROP
[0:0] -A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
[0:0] -A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP

Apologies, my lack of in-depth iptables experience is showing... :\

jerrac avatar Jan 12 '22 17:01 jerrac

Yes, progress :). Can you correlate increments to the drop counters when you see the issue? Conversely if you take down firewalld, do these counters stay the same?
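
One way to correlate them is to snapshot the drop/reject counters before and after a test run and diff the two (a minimal sketch; the file paths are arbitrary):

# snapshot the DROP/REJECT counters, run the test, snapshot again, compare
iptables-save -c | grep -E 'DROP|REJECT' > /tmp/drops-before.txt
cilium connectivity test
iptables-save -c | grep -E 'DROP|REJECT' > /tmp/drops-after.txt
diff /tmp/drops-before.txt /tmp/drops-after.txt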

christarazi avatar Jan 12 '22 23:01 christarazi

With firewalld enabled, before connectivity test:

iptables-save -c | grep -i drop 
:KUBE-MARK-DROP - [0:0]
[0:0] -A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
[0:0] -A INPUT -m conntrack --ctstate INVALID -j DROP
[0:0] -A FORWARD -m conntrack --ctstate INVALID -j DROP
[0:0] -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
[0:0] -A FWDI_public -j DROP
[0:0] -A FWDO_public -j DROP
[2:2358] -A IN_public -j DROP
[0:0] -A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
[0:0] -A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP

With firewalld enabled, after test:

:KUBE-MARK-DROP - [0:0]
[0:0] -A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
[0:0] -A INPUT -m conntrack --ctstate INVALID -j DROP
[0:0] -A FORWARD -m conntrack --ctstate INVALID -j DROP
[0:0] -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
[0:0] -A FWDI_public -j DROP
[0:0] -A FWDO_public -j DROP
[100:33042] -A IN_public -j DROP
[0:0] -A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
[0:0] -A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP

It sure looks like the -A IN_public -j DROP rule is dropping a lot of traffic (the bracketed counters are packets:bytes, so it went from 2 dropped packets before the test to 100 after).

If I disable firewalld and restart the VMs, this is what I see on the control plane:

Before the test:

# iptables-save -c | grep -i drop 
:FORWARD DROP [4:240]
[0:0] -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
[0:0] -A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
[0:0] -A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP
:KUBE-MARK-DROP - [0:0]
[0:0] -A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000


After the successful tests with firewalld off:

# iptables-save -c | grep -i drop 
:FORWARD DROP [0:0]
[0:0] -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
[0:0] -A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
[0:0] -A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP
:KUBE-MARK-DROP - [0:0]
[0:0] -A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000

If I'm interpreting this correctly, the -A IN_public -j DROP rule is to blame. I think that rule comes from me setting <zone target="DROP"> in the firewalld zone file, the idea being that I don't want to accept any traffic that isn't explicitly allowed. But I've allowed all the ports k8s and Cilium should need, at least as far as I can tell, so why is it still getting hit?

Or am I completely off on my thoughts?
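
For reference, firewalld's own query commands are one way to double-check what the public and trusted zones actually contain, i.e. their target, interfaces and ports. A quick sketch using the standard CLI:

firewall-cmd --get-active-zones
firewall-cmd --zone=public --list-all
firewall-cmd --zone=trusted --list-all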

jerrac avatar Jan 13 '22 16:01 jerrac

I finally turned on logging in firewalld, and what I'm seeing is interesting (assuming I understand the logs correctly): a lot of traffic is hitting the IN_public_DROP rule.

Jan 13 10:10:42 dkdevcp01 kernel: [ 1233.678855] IN_public_DROP: IN=eth0 OUT= MAC=52:54:00:95:b9:bf:52:54:00:a4:0e:89:08:00 SRC=192.168.121.180 DST=192.168.121.222 LEN=127 TOS=0x00 PREC=0x00 TTL=62 ID=10131 DF PROTO=TCP SPT=4240 DPT=60832 WINDOW=502 RES=0x00 ACK PSH URGP=0 

Where 192.168.121.222 is the cilium-* pod on the control plane, and 192.168.121.180 is the cilium-* pod on the worker node.

This is just normal traffic, not traffic during a connectivity test, so I'm not sure it's directly applicable to the test issue, but I am a bit confused. I'm seeing a lot of these lines where the source port is on the allowed list but the destination port is random and not in any allowed range. Is that expected?

Another thing I noticed: when I run iptables-save -c, the -A IN_public -j DROP rule is listed before the allow rules for all the ports, whereas I'd expect it to come after all the allow rules.

Does any of that shed any light on my issues?

jerrac avatar Jan 13 '22 18:01 jerrac

@jerrac I'm not familiar with firewalld, but some searching led me to https://firewalld.org/2018/12/rich-rule-priorities which seems quite relevant to you.
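
As a rough illustration of that post, a rich rule with a negative priority is applied before the zone's other rules and its final target, so something along these lines could let the node/pod subnet through ahead of the public zone's DROP. This is only a hedged sketch: 192.168.121.0/24 is a placeholder for your node subnet, and the exact syntax may need adjusting for your firewalld version.

firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" priority="-10" source address="192.168.121.0/24" accept'
firewall-cmd --reload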

christarazi avatar Jan 18 '22 01:01 christarazi

Hmm... Rich rules might be one way to get around things, but I haven't done anything with them before...

I did find this comment in firewalld's issue queue: https://github.com/firewalld/firewalld/issues/767#issuecomment-790687269 That led me to add cilium_host, cilium_net, and cilium_vxlan to the trusted zone. After restarting everything, the connectivity tests all passed just fine. :\
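
For anyone landing here, that change boils down to something like the following sketch (cilium_host and cilium_net are interfaces Cilium always creates; cilium_vxlan only exists in the default vxlan tunnel mode):

firewall-cmd --permanent --zone=trusted --add-interface=cilium_host
firewall-cmd --permanent --zone=trusted --add-interface=cilium_net
firewall-cmd --permanent --zone=trusted --add-interface=cilium_vxlan
firewall-cmd --reload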

That makes me wonder whether I need to be concerned about the plethora of virtual NICs that Kube creates: all the lxc* NICs listed in ifconfig's output, one per pod, right? Do I need to somehow make sure they're also added to the trusted zone? I thought Kube would add or modify any needed rules for the virtual NICs it creates when it starts up, presumably after firewalld, but since manually adding the cilium_* NICs to the trusted zone made a difference, I'm not sure anymore.
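
One way to check where those per-pod interfaces land, assuming firewalld's standard query flags (lxc12345 below is a placeholder name taken from ip link output):

ip -o link show | grep lxc
firewall-cmd --get-zone-of-interface=lxc12345
firewall-cmd --zone=trusted --list-interfaces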

Guess I'd better dive into how Kube networking works a bit deeper.

jerrac avatar Jan 21 '22 17:01 jerrac

Hi,

I'm encountering the same issue, but when I add the other NICs to the trusted zone and reboot, the connectivity test fails. I've added all the needed rules in firewalld. When I disable firewalld, the connectivity test succeeds.

LeoShivas avatar Aug 12 '23 17:08 LeoShivas

My installation is done like this:

Hypervisor

I use Proxmox 7.4. Here are the files used to generate the Rocky Linux 9 minimal install template. Packer file:

packer {
  required_plugins {
    proxmox = {
      version = ">= 1.1.2"
      source  = "github.com/hashicorp/proxmox"
    }
  }
}

local "template-date" {
  expression = formatdate("DD MMM YYYY hh:mm ZZZ", timestamp())
}

local "sub_string" {
  expression = "@@@@@"
}
variable "iso_url" {
  type    = string
  default = ""
}

variable "iso_checksum" {
  type    = string
  default = ""
}

variable "adm_pwd" {
  type    = string
  default = ""
}

variable "adm_username" {
  type    = string
  default = ""
}

variable "adm_ssh_public_key" {
  type    = string
  default = ""
}

variable "prx_node" {
  type    = string
  default = ""
}

variable "github_token" {
  type    = string
  default = ""
}

variable "github_repo" {
  type    = string
  default = ""
}

variable "github_ref_name" {
  type    = string
  default = ""
}

variable "bind_ip_address" {
  type    = string
  default = ""
}

variable "bind_ssh_port" {
  type    = number
  default = 22
}

variable "bind_ssh_user" {
  type    = string
  default = ""
}

# https://github.com/hashicorp/packer-plugin-proxmox/blob/main/docs/builders/iso.mdx
source "proxmox-iso" "main" {
  node                     = var.prx_node
  iso_download_pve         = true
  iso_url                  = var.iso_url
  iso_storage_pool         = "local"
  iso_checksum             = var.iso_checksum
  boot_command             = ["<up><tab> ADM_NAME=${var.adm_username} ADM_PWD=${bcrypt(var.adm_pwd)} ADM_SSH_PUBLIC_KEY=${replace(var.adm_ssh_public_key," ",local.sub_string)} SUB_STRING=${local.sub_string} inst.cmdline inst.ks=https://${var.github_token}@raw.githubusercontent.com/${var.github_repo}/${var.github_ref_name}/kickstart/ks-rocky9.cfg<enter><wait7m>"]
  http_directory           = "kickstart"
  insecure_skip_tls_verify = true
  memory                   = 8192
  cores                    = 6
  cpu_type                 = "host"
  os                       = "l26"
  network_adapters {
    model = "virtio"
    bridge = "vmbr1"
  }
  disks {
    disk_size    = "50G"
    storage_pool = "local"
    type         = "virtio"
  }
  template_name                = "rocky9tpl"
  template_description         = "Rocky 9.1 x86_64 minimal, generated on ${local.template-date}"
  unmount_iso                  = true
  onboot                       = true
  qemu_agent                   = true
  ssh_username                 = var.adm_username
  ssh_private_key_file         = "~/.ssh/id_rsa"
  ssh_timeout                  = "10m"
  ssh_host                     = "rocky9tpl"
  ssh_bastion_host             = var.bind_ip_address
  ssh_bastion_port             = var.bind_ssh_port
  ssh_bastion_username         = var.bind_ssh_user
  ssh_bastion_private_key_file = "~/.ssh/id_rsa"
}

build {
  sources = ["source.proxmox-iso.main"]
}

And the Kickstart file:

%pre --log=/tmp/pre.log
replace=" "
for x in $(cat /proc/cmdline); do
case $x in
ADM_NAME=*)
    adm_name=${x#ADM_NAME=}
    ;;
ADM_PWD=*)
    adm_pwd_crypted=${x#ADM_PWD=}
    ;;
ADM_SSH_PUBLIC_KEY=*)
    adm_ssh_public_key=${x#ADM_SSH_PUBLIC_KEY=}
    ;;
SUB_STRING=*)
    sub_string=${x#SUB_STRING=}
    ;;
esac
done
echo "user --groups=wheel --name=${adm_name} --uid=1000 --password=${adm_pwd_crypted} --iscrypted" > /tmp/create_user
cat << EOF > /tmp/add_key
mkdir -p /mnt/sysimage/home/${adm_name}/.ssh
echo "${adm_ssh_public_key//$sub_string/$replace}" > /mnt/sysimage/home/${adm_name}/.ssh/authorized_keys
chmod 600 /mnt/sysimage/home/${adm_name}/.ssh/authorized_keys
chown 1000:1000 /mnt/sysimage/home/${adm_name}/.ssh/authorized_keys
EOF

%end

# Use graphical install
#graphical
repo --name="minimal" --baseurl=file:///run/install/sources/mount-0000-cdrom/minimal

%addon com_redhat_kdump --disable

%end

# Keyboard layouts
keyboard --xlayouts='fr (oss)'
# System language
lang fr_FR.UTF-8

# Network information
network --bootproto=dhcp --device=ens18 --noipv6 --activate
network --hostname=rocky9tpl

# Use CDROM installation media
cdrom

%packages
@^minimal-environment
@guest-agents
bash-completion

%end

# Run the Setup Agent on first boot
#firstboot --enable
firstboot --disable
eula --agreed

# Generated using Blivet version 3.4.0
ignoredisk --only-use=vda
# System bootloader configuration
bootloader --location=mbr --boot-drive=vda
# Partition clearing information
clearpart --all --initlabel --drives=vda
# Disk partitioning information
part pv.111 --fstype="lvmpv" --ondisk=vda --size=1 --grow
part /boot --fstype="xfs" --ondisk=vda --size=1024
volgroup vg_root pv.111
logvol swap --fstype="swap" --size=2047 --name=lv_swap --vgname=vg_root
logvol / --fstype="xfs" --percent=100 --name=lv_root --vgname=vg_root

# System timezone
timezone Europe/Paris --utc

#Root password
rootpw --lock
%include /tmp/create_user

reboot

%post --nochroot
%include /tmp/add_key
sed -i 's/^%wheel/# %wheel/' /mnt/sysimage/etc/sudoers
sed -i 's/^# %wheel\tALL=(ALL)\tNOPASSWD: ALL/%wheel\tALL=(ALL)\tNOPASSWD: ALL/' /mnt/sysimage/etc/sudoers
%end

OS

A Rocky Linux 9 minimal-based installation, deployed by the following TF file that creates the VMs:

resource "macaddress" "kube_cp" {
  count = var.control_plane_count
}

resource "proxmox_vm_qemu" "kube_cp" {
  count                  = var.control_plane_count
  name                   = "kube-cp-${count.index + 1}"
  target_node            = var.prx_node
  clone                  = "rocky9tpl"
  desc                   = "Rocky Linux 9 VM fully cloned from rocky9tpl"
  agent                  = 1
  boot                   = "order=virtio0;ide2;net0"
  cores                  = 2
  define_connection_info = false
  force_create           = true
  memory                 = 2048
  onboot                 = true
  qemu_os                = "l26"

  disk {
    type    = "virtio"
    storage = "local"
    size    = "50G"
  }

  network {
    bridge  = "vmbr1"
    model   = "virtio"
    macaddr = macaddress.kube_cp[count.index].address
  }

  provisioner "file" {
    connection {
      user                = var.adm_username
      private_key         = var.adm_private_key
      host                = "rocky9tpl"
      bastion_host        = var.bind_ip_address
      bastion_port        = var.bind_ssh_port
      bastion_user        = var.bind_ssh_user
      bastion_private_key = var.bind_ssh_private_key
    }
    source      = "../files/post_hostname.sh"
    destination = "/tmp/script.sh"
  }

  provisioner "remote-exec" {
    connection {
      user                = var.adm_username
      private_key         = var.adm_private_key
      host                = "rocky9tpl"
      bastion_host        = var.bind_ip_address
      bastion_port        = var.bind_ssh_port
      bastion_user        = var.bind_ssh_user
      bastion_private_key = var.bind_ssh_private_key
    }
    inline = [
      "sudo chmod +x /tmp/script.sh",
      "sudo /tmp/script.sh ${self.name}",
    ]
  }
}

Here is the simple post_hostname.sh:

#!/usr/bin/env bash
hostnamectl set-hostname $1
nmcli c down $(nmcli c|grep ethernet|awk '{print $1}')
sleep 2
nmcli c up $(nmcli c|grep ethernet|awk '{print $1}')
shutdown -r +1

Kubernetes installation

Requirements: I've installed the required packages with Ansible this way:

---
- name: add docker centos repo
  yum_repository:
    name: docker-ce-stable
    description: Docker CE Stable - $basearch
    baseurl: https://download.docker.com/linux/centos/$releasever/$basearch/stable
    gpgkey: https://download.docker.com/linux/centos/gpg
    enabled: yes
    gpgcheck: yes
    state: present
- name: add kubernetes repo
  yum_repository:
    name: kubernetes
    description: Kubernetes
    baseurl: https://packages.cloud.google.com/yum/repos/kubernetes-el7-$basearch
    enabled: yes
    gpgcheck: yes
    repo_gpgcheck: yes
    gpgkey: https://packages.cloud.google.com/yum/doc/yum-key.gpg
      https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
    state: present
    exclude:
      - kubelet
      - kubeadm
      - kubectl
---
- name: Update and upgrade dnf
  dnf:
    name: '*'
    state: latest
    update_cache: yes
---
- name: disable swap
  command: swapoff -a
  when: ansible_swaptotal_mb > 0
- name: remove swap from fstab
  lineinfile:
    path: /etc/fstab
    regexp: '^.*swap.*$'
    state: absent
  when: ansible_swaptotal_mb > 0
---
- name: enable br_netfilter
  modprobe:
    name: br_netfilter
    state: present
    persistent: present
- name: enable overlay
  modprobe:
    name: overlay
    state: present
    persistent: present
- name: enable net.bridge.bridge-nf-call-iptables
  sysctl:
    name: net.bridge.bridge-nf-call-iptables
    value: 1
    sysctl_set: yes
    state: present
    reload: yes
- name: enable net.bridge.bridge-nf-call-ip6tables
  sysctl:
    name: net.bridge.bridge-nf-call-ip6tables
    value: 1
    sysctl_set: yes
    state: present
    reload: yes
- name: enable net.ipv4.ip_forward
  sysctl:
    name: net.ipv4.ip_forward
    value: 1
    sysctl_set: yes
    state: present
    reload: yes
---
- name: install containerd
  dnf:
    name:
      - containerd.io
    state: present
    update_cache: yes
- name: enable containerd
  systemd:
    name: containerd
    enabled: yes
    state: started
- name: Retrieve containerd configuration
  command: containerd config default
  register: containerd_config
  changed_when: containerd_config.stdout == ""
  failed_when: containerd_config.rc != 0
- name: Save containerd configuration
  vars:
    containerd_config_updated: "{{ containerd_config.stdout | regex_replace('(SystemdCgroup = ).*', '\\1true') }}"
  copy:
    content: "{{ containerd_config_updated }}"
    dest: /etc/containerd/config.toml
  notify: Restart containerd
  when: containerd_config_updated != containerd_config
---
# https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/#install-using-native-package-management
- name: install kubernetes
  dnf:
    name:
      - kubelet-1.27.3
      - kubeadm-1.27.3
      - kubectl-1.27.3
    disable_excludes: kubernetes
    state: present
    update_cache: yes
  notify: Daemon Reload
- name: enable kubelet
  systemd:
    name: kubelet
    enabled: yes
    state: started
- name: Get kubectl completion
  shell: kubectl completion bash
  register: kubectl_completion_script
  changed_when: false
- name: Create kubectl completion script file
  file:
    path: /etc/bash_completion.d/kubectl
    state: touch
    mode: a+r
    modification_time: preserve
    access_time: preserve
- name: Retrieve kubectl completion script content
  slurp:
    src: /etc/bash_completion.d/kubectl
  register: kubectl_completion_script_file
- name: Write kubectl completion script
  copy:
    dest: /etc/bash_completion.d/kubectl
    content: "{{ kubectl_completion_script.stdout }}"
  when: kubectl_completion_script.stdout != kubectl_completion_script_file.content
- name: Write kubectl profile script
  copy:
    dest: /etc/profile.d/kubectl.sh
    content: |
      alias k=kubectl
      complete -o default -F __start_kubectl k
    mode: a+r
---
# https://kubernetes.io/docs/reference/networking/ports-and-protocols/
- name: Open Kubelet API port
  ansible.posix.firewalld:
    port: "10250/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open NodePort Services port
  ansible.posix.firewalld:
    port: "30000-32767/tcp"
    permanent: true
    state: enabled
    immediate: true
  when: ansible_hostname in groups['kube_wk']
# https://docs.cilium.io/en/stable/operations/system_requirements/
- name: Open Cilium VXLAN overlay port
  ansible.posix.firewalld:
    port: "8472/udp"
    permanent: true
    state: enabled
    immediate: true
- name: Open Cilium cluster health checks port
  ansible.posix.firewalld:
    port: "4240/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open Cilium Hubble server port
  ansible.posix.firewalld:
    port: "4244/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open Cilium Hubble Relay port
  ansible.posix.firewalld:
    port: "4245/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open Mutual Authentication port
  ansible.posix.firewalld:
    port: "4250/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open Spire Agent health check port
  ansible.posix.firewalld:
    port: "4251/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open cilium-agent pprof server port
  ansible.posix.firewalld:
    port: "6060/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open cilium-operator pprof server port
  ansible.posix.firewalld:
    port: "6061/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open Hubble Relay pprof server port
  ansible.posix.firewalld:
    port: "6062/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open cilium-envoy health listener port
  ansible.posix.firewalld:
    port: "9878/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open cilium-agent health status API port
  ansible.posix.firewalld:
    port: "9879/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open cilium-agent gops server port
  ansible.posix.firewalld:
    port: "9890/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open operator gops server port
  ansible.posix.firewalld:
    port: "9891/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open Hubble Relay gops server port
  ansible.posix.firewalld:
    port: "9893/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open cilium-agent Prometheus metrics port
  ansible.posix.firewalld:
    port: "9962/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open cilium-operator Prometheus metrics port
  ansible.posix.firewalld:
    port: "9963/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open cilium-envoy Prometheus metrics port
  ansible.posix.firewalld:
    port: "9964/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open WireGuard encryption tunnel endpoint port
  ansible.posix.firewalld:
    port: "51871/udp"
    permanent: true
    state: enabled
    immediate: true
# https://kubernetes.io/docs/tasks/administer-cluster/ip-masq-agent/
- name: Enable IP masquerade
  ansible.posix.firewalld:
    masquerade: true
    state: enabled
    permanent: true

Control plane ports opening: Then I've opened the control plane node ports with the following Ansible file:

---
# https://kubernetes.io/docs/reference/networking/ports-and-protocols/
- name: Open Kubernetes API server port
  ansible.posix.firewalld:
    port: "6443/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open etcd server client API port
  ansible.posix.firewalld:
    port: "2379-2380/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open kube-scheduler port
  ansible.posix.firewalld:
    port: "10259/tcp"
    permanent: true
    state: enabled
    immediate: true
- name: Open kube-controller-manager port
  ansible.posix.firewalld:
    port: "10257/tcp"
    permanent: true
    state: enabled
    immediate: true

Initialization step: And I've initialized the first control plane with this Ansible file:

- name: Play all tasks for initializing Kubernetes cluster
  block:
  - name: Check if Kubernetes is already initialized
    shell: kubectl cluster-info
    register: kubectl_cluster_info
    changed_when: false
    failed_when: false
    become: no
    ignore_errors: true
  - name: Add Kubernetes endpoint to /etc/hosts
    lineinfile:
      path: /etc/hosts
      line: "127.0.0.1 {{ kube_endpoint }}"
    when: kubectl_cluster_info.rc != 0
  - name: Initialize Kubernetes cluster
    shell: kubeadm init --control-plane-endpoint "{{ kube_endpoint }}" --upload-certs | tee kubeadm-init-`date '+%Y-%m-%d_%H-%M-%S'`.out
    when: kubectl_cluster_info.rc != 0
  - name: Remove Kubernetes endpoint from /etc/hosts
    lineinfile:
      path: /etc/hosts
      line: "127.0.0.1 {{ kube_endpoint }}"
      state: absent
  - name: Create .kube directory
    become: no
    file:
      path: /home/{{ ansible_user }}/.kube
      state: directory
  - name: Copy admin.conf to .kube directory
    copy:
      src: /etc/kubernetes/admin.conf
      dest: /home/{{ ansible_user }}/.kube/config
      owner: "{{ ansible_user }}"
      group: "{{ ansible_user }}"
      remote_src: true
  - name: Check if Kubernetes is successfully initialized
    shell: kubectl cluster-info
    register: kubectl_cluster_info
    changed_when: false
    failed_when: kubectl_cluster_info.rc != 0
    become: no
  when: ansible_hostname == hostvars[ansible_hostname]['groups']['kube_first_cp'][0]

Cilium installation: Then, the most important part, I've installed Cilium with these Ansible steps:

---
# tasks file for cilium
- name: Play all tasks for installing cilium stack
  block:
  - name: Retrieve cilium CLI version
    shell: curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt
    register: cilium_cli_version
    changed_when: false
  - name: Download cilium CLI
    get_url:
      url: https://github.com/cilium/cilium-cli/releases/download/{{ cilium_cli_version.stdout }}/cilium-linux-amd64.tar.gz
      checksum: sha256:https://github.com/cilium/cilium-cli/releases/download/{{ cilium_cli_version.stdout }}/cilium-linux-amd64.tar.gz.sha256sum
      dest: /tmp/cilium-linux-{{ cilium_cli_version.stdout }}.tar.gz
    when: ansible_architecture == "x86_64"
  - name: Extract cilium CLI
    unarchive:
      src: /tmp/cilium-linux-{{ cilium_cli_version.stdout }}.tar.gz
      dest: /usr/local/bin
      owner: root
      group: root
      remote_src: yes
  - name: Install cilium CNI block
    block:
    - name: Check if cilium is not installed
      shell: cilium status | grep -E "Cilium:|Operator:" | grep OK | wc -l
      register: cilium_status
      changed_when: false
      failed_when: false
    - name: Install cilium CNI
      block:
      - name: Retrieve cilium CNI version
        shell: curl -s https://raw.githubusercontent.com/cilium/cilium/main/stable.txt
        register: cilium_version
        changed_when: false
      - name: Install cilium
        shell: cilium install --version {{ cilium_version.stdout }}
        register: cilium_install
        failed_when: cilium_install.rc != 0
      - name: Check if cilium is well installed
        shell: cilium status --wait | grep -E "Cilium:|Operator:"|grep OK|wc -l
        register: cilium_status_install
        changed_when: false
        failed_when: cilium_status_install.stdout != "2"
      when: cilium_status.stdout != "2"
    become: false
  when: ansible_hostname == hostvars[ansible_hostname]['groups']['kube_first_cp'][0]
- ansible.posix.firewalld:
    zone: trusted
    interface: cilium_vxlan
    permanent: true
    state: enabled
- ansible.posix.firewalld:
    zone: trusted
    interface: cilium_net
    permanent: true
    state: enabled
- ansible.posix.firewalld:
    zone: trusted
    interface: cilium_host
    permanent: true
    state: enabled

Join step: Finally, I've joined all my nodes with this:

---
- name: Check join status
  shell: kubectl get nodes {{ ansible_hostname }}
  register: join_status
  delegate_to: "{{ hostvars[ansible_hostname]['groups']['kube_first_cp'][0] }}"
  become: no
  changed_when: false
  failed_when: false
- name: Join CP
  block:
  - name: Control Planes join command
    block:
    - name: Get CP join command
      shell: kubeadm token create --print-join-command --certificate-key $(sudo kubeadm init phase upload-certs --upload-certs | tail -1)
      register: join_cp_command
      become: no
      delegate_to: "{{ hostvars[ansible_hostname]['groups']['kube_first_cp'][0] }}"
      changed_when: false
    - name: CP join execution
      shell: "{{ join_cp_command.stdout }} | tee kubeadm-join-`date '+%Y-%m-%d_%H-%M-%S'`.out"
    when: ansible_hostname in groups['kube_cp']
  - name: Workers join command
    block:
    - name: Get worker join command
      shell: kubeadm token create --print-join-command
      register: join_wk_command
      become: no
      delegate_to: "{{ hostvars[ansible_hostname]['groups']['kube_first_cp'][0] }}"
      changed_when: false
    - name: Worker join execution
      shell: "{{ join_wk_command.stdout }} | tee kubeadm-join-`date '+%Y-%m-%d_%H-%M-%S'`.out"
    when: ansible_hostname in groups['kube_wk']
  when: join_status.rc != 0
- name: Post join control plane
  block:
  - name: Create .kube directory
    become: no
    file:
      path: /home/{{ ansible_user }}/.kube
      state: directory
  - name: Retrieve admin.conf from first CP
    fetch:
      src: /etc/kubernetes/admin.conf
      dest: /tmp/
      flat: true
    delegate_to: "{{ hostvars[ansible_hostname]['groups']['kube_first_cp'][0] }}"
    changed_when: false
  - name: Copy admin.conf to .kube directory
    copy:
      src: /tmp/admin.conf
      dest: /home/{{ ansible_user }}/.kube/config
      owner: "{{ ansible_user }}"
      group: "{{ ansible_user }}"
  when: ansible_hostname in groups['kube_cp']
- name: Post join worker
  block:
  - name: Check if node is labeled
    shell: kubectl get nodes -l 'node-role.kubernetes.io/worker in ()'|grep ^{{ ansible_hostname }}
    register: node_labels
    become: no
    delegate_to: "{{ hostvars[ansible_hostname]['groups']['kube_first_cp'][0] }}"
    changed_when: false
    failed_when: false
  - name: Labeling node
    shell: "kubectl label nodes {{ ansible_hostname }} node-role.kubernetes.io/worker="
    become: no
    delegate_to: "{{ hostvars[ansible_hostname]['groups']['kube_first_cp'][0] }}"
    when: node_labels.rc != 0
  when: ansible_hostname in groups['kube_wk']
- name: Check join status
  shell: kubectl get nodes {{ ansible_hostname }}
  register: join_status
  delegate_to: "{{ hostvars[ansible_hostname]['groups']['kube_first_cp'][0] }}"
  become: no
  changed_when: false

LeoShivas avatar Aug 14 '23 07:08 LeoShivas

For the connectivity test, I've enabled Hubble with:

[myuser@kube-cp-1 ~]$ cilium hubble enable
[myuser@kube-cp-1 ~]$ cilium hubble port-forward &
[1] 2872
[myuser@kube-cp-1 ~]$

And then I've launched the test with cilium connectivity test --force-deploy:

[jlnadm@kube-cp-1 ~]$ cilium connectivity test --force-deploy
ℹ️  Monitor aggregation detected, will skip some flow validation steps
🔥 [kubernetes] Deleting connectivity check deployments...
⌛ [kubernetes] Waiting for namespace cilium-test to disappear
✨ [kubernetes] Creating namespace cilium-test for connectivity check...
✨ [kubernetes] Deploying echo-same-node service...
✨ [kubernetes] Deploying DNS test server configmap...
✨ [kubernetes] Deploying same-node deployment...
✨ [kubernetes] Deploying client deployment...
✨ [kubernetes] Deploying client2 deployment...
✨ [kubernetes] Deploying echo-other-node service...
✨ [kubernetes] Deploying other-node deployment...
✨ [host-netns] Deploying kubernetes daemonset...
✨ [host-netns-non-cilium] Deploying kubernetes daemonset...
✨ [kubernetes] Deploying echo-external-node deployment...
⌛ [kubernetes] Waiting for deployment cilium-test/client to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/client2 to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/echo-same-node to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/echo-other-node to become ready...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/client-6b4b857d98-pw2rs to appear...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/client2-646b88fb9b-fr2wz to appear...
⌛ [kubernetes] Waiting for pod cilium-test/client-6b4b857d98-pw2rs to reach DNS server on cilium-test/echo-same-node-965bbc7d4-s6dhj pod...
⌛ [kubernetes] Waiting for pod cilium-test/client2-646b88fb9b-fr2wz to reach DNS server on cilium-test/echo-same-node-965bbc7d4-s6dhj pod...
⌛ [kubernetes] Waiting for pod cilium-test/client-6b4b857d98-pw2rs to reach DNS server on cilium-test/echo-other-node-545c9b778b-pzb8d pod...
⌛ [kubernetes] Waiting for pod cilium-test/client2-646b88fb9b-fr2wz to reach DNS server on cilium-test/echo-other-node-545c9b778b-pzb8d pod...
⌛ [kubernetes] Waiting for pod cilium-test/client-6b4b857d98-pw2rs to reach default/kubernetes service...
⌛ [kubernetes] Waiting for pod cilium-test/client2-646b88fb9b-fr2wz to reach default/kubernetes service...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/echo-other-node-545c9b778b-pzb8d to appear...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/echo-same-node-965bbc7d4-s6dhj to appear...
⌛ [kubernetes] Waiting for Service cilium-test/echo-other-node to become ready...
⌛ [kubernetes] Waiting for Service cilium-test/echo-other-node to be synchronized by Cilium pod kube-system/cilium-j4mbf
⌛ [kubernetes] Waiting for Service cilium-test/echo-same-node to become ready...
⌛ [kubernetes] Waiting for Service cilium-test/echo-same-node to be synchronized by Cilium pod kube-system/cilium-j4mbf
⌛ [kubernetes] Waiting for NodePort 192.168.1.138:32686 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.138:31689 (cilium-test/echo-same-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.152:31689 (cilium-test/echo-same-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.152:32686 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.109:31689 (cilium-test/echo-same-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.109:32686 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.136:31689 (cilium-test/echo-same-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.136:32686 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.151:32686 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.151:31689 (cilium-test/echo-same-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.150:32686 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.150:31689 (cilium-test/echo-same-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.137:32686 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.1.137:31689 (cilium-test/echo-same-node) to become ready...
ℹ️  Skipping IPCache check
🔭 Enabling Hubble telescope...
ℹ️  Hubble is OK, flows: 22000/28665
ℹ️  Cilium version: 1.14.0
🏃 Running tests...
[=] Test [no-policies]
..........................
[...]

And here are the results:

📋 Test Report
❌ 7/42 tests failed (42/335 actions), 13 tests skipped, 1 scenarios skipped:
Test [echo-ingress-l7]:
  ❌ echo-ingress-l7/pod-to-pod-with-endpoints/curl-ipv4-3-public: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-3-public (10.0.4.124:8080)
  ❌ echo-ingress-l7/pod-to-pod-with-endpoints/curl-ipv4-3-private: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-3-private (10.0.4.124:8080)
  ❌ echo-ingress-l7/pod-to-pod-with-endpoints/curl-ipv4-3-privatewith-header: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-3-privatewith-header (10.0.4.124:8080)
Test [echo-ingress-l7-named-port]:
  ❌ echo-ingress-l7-named-port/pod-to-pod-with-endpoints/curl-ipv4-3-public: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-3-public (10.0.4.124:8080)
  ❌ echo-ingress-l7-named-port/pod-to-pod-with-endpoints/curl-ipv4-3-private: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-3-private (10.0.4.124:8080)
  ❌ echo-ingress-l7-named-port/pod-to-pod-with-endpoints/curl-ipv4-3-privatewith-header: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-3-privatewith-header (10.0.4.124:8080)
Test [client-egress-l7-method]:
  ❌ client-egress-l7-method/pod-to-pod-with-endpoints/curl-ipv4-1-public: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-1-public (10.0.4.124:8080)
  ❌ client-egress-l7-method/pod-to-pod-with-endpoints/curl-ipv4-1-private: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-1-private (10.0.4.124:8080)
  ❌ client-egress-l7-method/pod-to-pod-with-endpoints/curl-ipv4-1-privatewith-header: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-1-privatewith-header (10.0.4.124:8080)
  ❌ client-egress-l7-method/pod-to-pod-with-endpoints/curl-ipv4-1-public: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-1-public (10.0.6.93:8080)
  ❌ client-egress-l7-method/pod-to-pod-with-endpoints/curl-ipv4-1-private: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-1-private (10.0.6.93:8080)
  ❌ client-egress-l7-method/pod-to-pod-with-endpoints/curl-ipv4-1-privatewith-header: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> curl-ipv4-1-privatewith-header (10.0.6.93:8080)
Test [client-egress-l7]:
  ❌ client-egress-l7/pod-to-pod/curl-ipv4-2: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> cilium-test/echo-other-node-545c9b778b-pzb8d (10.0.6.93:8080)
  ❌ client-egress-l7/pod-to-pod/curl-ipv4-3: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> cilium-test/echo-same-node-965bbc7d4-s6dhj (10.0.4.124:8080)
  ❌ client-egress-l7/pod-to-world/http-to-one.one.one.one-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-http (one.one.one.one:80)
  ❌ client-egress-l7/pod-to-world/https-to-one.one.one.one-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-https (one.one.one.one:443)
  ❌ client-egress-l7/pod-to-world/https-to-one.one.one.one-index-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-https-index (one.one.one.one:443)
  ❌ client-egress-l7/pod-to-world/http-to-one.one.one.one-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-http (one.one.one.one:80)
  ❌ client-egress-l7/pod-to-world/https-to-one.one.one.one-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-https (one.one.one.one:443)
  ❌ client-egress-l7/pod-to-world/https-to-one.one.one.one-index-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-https-index (one.one.one.one:443)
Test [client-egress-l7-named-port]:
  ❌ client-egress-l7-named-port/pod-to-pod/curl-ipv4-2: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> cilium-test/echo-other-node-545c9b778b-pzb8d (10.0.6.93:8080)
  ❌ client-egress-l7-named-port/pod-to-pod/curl-ipv4-3: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> cilium-test/echo-same-node-965bbc7d4-s6dhj (10.0.4.124:8080)
  ❌ client-egress-l7-named-port/pod-to-world/http-to-one.one.one.one-0: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-http (one.one.one.one:80)
  ❌ client-egress-l7-named-port/pod-to-world/https-to-one.one.one.one-0: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-https (one.one.one.one:443)
  ❌ client-egress-l7-named-port/pod-to-world/https-to-one.one.one.one-index-0: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-https-index (one.one.one.one:443)
  ❌ client-egress-l7-named-port/pod-to-world/http-to-one.one.one.one-1: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-http (one.one.one.one:80)
  ❌ client-egress-l7-named-port/pod-to-world/https-to-one.one.one.one-1: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-https (one.one.one.one:443)
  ❌ client-egress-l7-named-port/pod-to-world/https-to-one.one.one.one-index-1: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-https-index (one.one.one.one:443)
Test [dns-only]:
  ❌ dns-only/pod-to-world/http-to-one.one.one.one-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-http (one.one.one.one:80)
  ❌ dns-only/pod-to-world/https-to-one.one.one.one-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-https (one.one.one.one:443)
  ❌ dns-only/pod-to-world/https-to-one.one.one.one-index-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-https-index (one.one.one.one:443)
  ❌ dns-only/pod-to-world/http-to-one.one.one.one-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-http (one.one.one.one:80)
  ❌ dns-only/pod-to-world/https-to-one.one.one.one-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-https (one.one.one.one:443)
  ❌ dns-only/pod-to-world/https-to-one.one.one.one-index-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-https-index (one.one.one.one:443)
Test [to-fqdns]:
  ❌ to-fqdns/pod-to-world/http-to-one.one.one.one-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-http (one.one.one.one:80)
  ❌ to-fqdns/pod-to-world/https-to-one.one.one.one-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-https (one.one.one.one:443)
  ❌ to-fqdns/pod-to-world/https-to-one.one.one.one-index-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> one.one.one.one-https-index (one.one.one.one:443)
  ❌ to-fqdns/pod-to-world/http-to-one.one.one.one-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-http (one.one.one.one:80)
  ❌ to-fqdns/pod-to-world/https-to-one.one.one.one-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-https (one.one.one.one:443)
  ❌ to-fqdns/pod-to-world/https-to-one.one.one.one-index-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> one.one.one.one-https-index (one.one.one.one:443)
  ❌ to-fqdns/pod-to-world-2/https-cilium-io-0: cilium-test/client-6b4b857d98-pw2rs (10.0.4.169) -> cilium-io-https (cilium.io:443)
  ❌ to-fqdns/pod-to-world-2/https-cilium-io-1: cilium-test/client2-646b88fb9b-fr2wz (10.0.4.242) -> cilium-io-https (cilium.io:443)
connectivity test failed: 7 tests failed

LeoShivas avatar Aug 14 '23 07:08 LeoShivas

I finally found out what is missing.

Each time I update the firewalld rules, I clean up the connectivity check, reboot the nodes, and redeploy it, roughly as follows (the reboot step depends on how the nodes are managed):
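kubectl delete --namespace=cilium-test -f https://raw.githubusercontent.com/cilium/cilium/1.14.1/examples/kubernetes/connectivity-check/connectivity-check.yaml
# reboot each node (for example via ssh: sudo reboot)
kubectl apply --namespace=cilium-test -f https://raw.githubusercontent.com/cilium/cilium/1.14.1/examples/kubernetes/connectivity-check/connectivity-check.yaml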

Here are the cilium-test pods before the change:

[myuser@kube-cp-1 ~]$ k -n cilium-test get po
NAME                                                     READY   STATUS             RESTARTS        AGE
echo-a-6575c98b7d-k5zz6                                  1/1     Running            0               7m27s
echo-b-54b86d8976-tw87s                                  1/1     Running            0               7m26s
echo-b-host-54d5cc5fcd-92wl2                             1/1     Running            0               7m20s
host-to-b-multi-node-clusterip-846b574bbc-kh8bj          1/1     Running            0               6m47s
host-to-b-multi-node-headless-5b4bf5459f-xp9fj           1/1     Running            1 (5m40s ago)   6m46s
pod-to-a-6578dd7fbf-2hrt7                                1/1     Running            1 (6m15s ago)   7m18s
pod-to-a-allowed-cnp-57fd79848c-gqtcv                    1/1     Running            1 (6m4s ago)    7m8s
pod-to-a-denied-cnp-d984d7757-h98lp                      1/1     Running            0               7m11s
pod-to-b-intra-node-nodeport-6654886dc9-shqst            1/1     Running            1 (5m32s ago)   6m37s
pod-to-b-multi-node-clusterip-54847b87b9-c7vv6           1/1     Running            1 (5m55s ago)   7m1s
pod-to-b-multi-node-headless-64b4d78855-h67tb            1/1     Running            1 (5m45s ago)   6m48s
pod-to-b-multi-node-nodeport-64757f6d5f-44699            1/1     Running            1 (5m32s ago)   6m39s
pod-to-external-1111-76c448d975-69wdz                    1/1     Running            0               7m16s
pod-to-external-fqdn-allow-google-cnp-56c545c6b9-55grt   0/1     CrashLoopBackOff   5 (24s ago)     7m7s

After adding a rule for port 53/udp to firewalld, here are the cilium-test pods:

[jlnadm@kube-cp-1 ~]$ k -n cilium-test get po
NAME                                                     READY   STATUS    RESTARTS        AGE
echo-a-6575c98b7d-j2489                                  1/1     Running   0               5m24s
echo-b-54b86d8976-bnhvx                                  1/1     Running   0               5m24s
echo-b-host-54d5cc5fcd-hmfc9                             1/1     Running   0               90s
host-to-b-multi-node-clusterip-846b574bbc-jng5k          1/1     Running   0               4m38s
host-to-b-multi-node-headless-5b4bf5459f-st2f6           1/1     Running   0               4m35s
pod-to-a-6578dd7fbf-rz4kt                                1/1     Running   1 (3m56s ago)   5m16s
pod-to-a-allowed-cnp-57fd79848c-h24mp                    1/1     Running   1 (3m57s ago)   5m3s
pod-to-a-denied-cnp-d984d7757-h2sqw                      1/1     Running   0               5m7s
pod-to-b-intra-node-nodeport-6654886dc9-9gmk7            1/1     Running   3 (78s ago)     4m25s
pod-to-b-multi-node-clusterip-54847b87b9-6dl69           1/1     Running   1 (3m43s ago)   4m51s
pod-to-b-multi-node-headless-64b4d78855-pclwv            1/1     Running   1 (3m40s ago)   4m43s
pod-to-b-multi-node-nodeport-64757f6d5f-8wrx9            1/1     Running   3 (85s ago)     4m30s
pod-to-external-1111-76c448d975-sp2fz                    1/1     Running   0               5m11s
pod-to-external-fqdn-allow-google-cnp-56c545c6b9-2rfpc   1/1     Running   0               4m57s

But when I launch cilium connectivity test again, 6 tests still fail (instead of 7).

Then, after also adding rules for ports 80/tcp and 8080/tcp to firewalld, the tests finally succeed:

........

✅ All 42 tests (335 actions) successful, 13 tests skipped, 1 scenarios skipped.
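
For reference, the firewalld changes that fixed it in my case were roughly the following (this is just a sketch; the zone and the exact ports you need may differ in your environment):

sudo firewall-cmd --permanent --add-port=53/udp    # DNS
sudo firewall-cmd --permanent --add-port=80/tcp
sudo firewall-cmd --permanent --add-port=8080/tcp  # the echo pods serve on 8080
sudo firewall-cmd --reload                         # apply the permanent rules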

What next?

Maybe the documentation should be improved to list the ports that need to be opened? Or maybe the cilium connectivity tests could be improved to take these firewall rules into account?

LeoShivas avatar Aug 18 '23 14:08 LeoShivas

@LeoShivas Thanks for the report. We are happy to accept contributions to the docs regarding the connectivity test requirements. The relevant page is here. The docs can be found in the Documentation/ directory of https://github.com/cilium/cilium.

christarazi avatar Aug 18 '23 18:08 christarazi

@LeoShivas, I have the same issue, but I don't see the firewalld service running; only iptables is running. Please advise.

Rammurthy5 avatar Nov 25 '23 21:11 Rammurthy5

Just a quick side note for anyone experiencing test failures: make sure your nodes aren't resource-exhausted. I had two worker nodes running with 512 MB of RAM, and the tests failed because the nodes ran out of memory during the run, so kube-scheduler wasn't able to schedule pods in the cilium-test namespace. After upgrading them to 1024 MB of RAM, the connectivity tests passed.
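
If you want to confirm that this is the problem, something along these lines should show it (kubectl top needs metrics-server installed, and <node-name> is a placeholder):

kubectl top nodes                                            # current CPU/memory usage per node
kubectl describe node <node-name>                            # check Conditions for MemoryPressure
kubectl get events -n cilium-test --sort-by=.lastTimestamp   # look for FailedScheduling events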

is-it-ayush avatar Mar 30 '24 00:03 is-it-ayush