connectivity test: "pod-to-world" case fails with HTTP 403 response from Cloudflare
Bug report
My team's e2e tests (which include the Cilium connectivity tests) show some flaky behavior around the pod-to-world test case that looks like this:
```
❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null https://one.one.one.one:443" failed: command terminated with exit code 22
  ℹ️ curl output:
  curl: (22) The requested URL returned error: 403
  10.244.2.92:41936 -> 1.0.0.1:443 = 403
📄 No flows recorded during action https-to-one-one-one-one-0
[.] Action [no-policies/pod-to-world/https-to-one-one-one-one-index-0: cilium-test/client-6488dcf5d4-4fjhn (10.244.2.92) -> one-one-one-one-https-index (one.one.one.one:443)]
❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null https://one.one.one.one:443/index.html" failed: command terminated with exit code 22
  ℹ️ curl output:
  curl: (22) The requested URL returned error: 403
  10.244.2.92:45752 -> 1.1.1.1:443 = 403
```
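The 403 can be verified by hand from the failing client pod (pod name taken from the log above). This is just a sanity check, assuming the client image ships curl, which it must since the test itself invokes it; Cloudflare responses identify themselves via a `server: cloudflare` and a `cf-ray` header:

```sh
# Re-run the failing request from the client pod, but print the response
# headers instead of discarding them (pod name copied from the log above).
kubectl exec -n cilium-test client-6488dcf5d4-4fjhn -- \
  curl -sI https://one.one.one.one:443/index.html

# A block at Cloudflare's edge looks like:
#   HTTP/2 403
#   server: cloudflare
#   cf-ray: ...
# i.e. the request left the cluster and reached Cloudflare, so pod-to-world
# connectivity itself is fine; the 403 is an application-layer rejection.
```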
I suspect the issue here is that the cloud provider (DigitalOcean) worker node IPs get flagged as abusive by Cloudflare and are automatically blocked by their firewall. I can imagine two possible solutions here:
- allow the "world" target to be configurable (https://github.com/cilium/cilium-cli/issues/222)
- accept HTTP 403 responses if it can be determined that the status actually comes from a target outside the cluster, since to my understanding this would still validate connectivity between pod and world (https://github.com/cilium/cilium-cli/issues/174); see the sketch after this list
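To illustrate the second option, here is a minimal shell sketch of the kind of check the test could perform, not actual cilium-cli code: drop `--fail` so curl no longer exits 22 on HTTP errors, then accept a 403 only when the responding address lies outside the cluster. The `POD_CIDR` prefix is an assumed example value, and the prefix match is a crude stand-in for a real CIDR check:

```sh
#!/usr/bin/env sh
# Sketch: run the probe without --fail so curl does not exit 22 on HTTP
# errors, then decide ourselves which status codes count as "world reachable".
# POD_CIDR is an assumed example; a real check would derive it from the cluster.
POD_CIDR="10.244."

out=$(curl --silent --show-error --connect-timeout 5 --output /dev/null \
      -w '%{remote_ip} %{response_code}' https://one.one.one.one:443/index.html)
remote_ip=${out% *}
code=${out#* }

# A response from inside the pod CIDR proves nothing about world connectivity.
case "$remote_ip" in
  "$POD_CIDR"*) echo "response came from inside the cluster: fail"; exit 1 ;;
esac

# 2xx is plain success; a 403 from an external IP still proves the packet
# left the cluster and came back, which is all pod-to-world should assert.
case "$code" in
  2??|403) echo "$remote_ip returned $code: connectivity OK"; exit 0 ;;
  *)       echo "$remote_ip returned $code: fail"; exit 1 ;;
esac
```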
General Information
- Cilium CLI version (run `cilium version`): 0.9.1
- Orchestration system version in use (e.g. `kubectl version`, ...): ?
- Platform / infrastructure information (e.g. AWS / Azure / GCP, image / kernel versions): DigitalOcean
How to reproduce the issue
- Create a DigitalOcean cluster: https://docs.digitalocean.com/products/kubernetes/quickstart/
- Get the kubeconfig and set the `KUBECONFIG` environment variable: https://docs.digitalocean.com/products/kubernetes/how-to/connect-to-cluster/
- Run the Cilium CLI 0.9.1 connectivity tests: `cilium connectivity test`

To reproduce this it may be necessary to repeatedly create and delete worker node pools until you get a worker node with an IP from the DO ASN that Cloudflare has flagged as abusive; a scripted version of these steps is sketched below.
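Scripted, the reproduction loop looks roughly like the following. It assumes an installed and authenticated `doctl`; the region, node size, and pool names are placeholder values, and the node-pool cycling at the end is the part that eventually lands a worker on a Cloudflare-flagged IP:

```sh
# Create a DigitalOcean Kubernetes cluster and merge its credentials into
# the local kubeconfig (region and node count are example values).
doctl kubernetes cluster create cilium-repro --region nyc1 --count 2
doctl kubernetes cluster kubeconfig save cilium-repro

# Install Cilium and run the connectivity tests.
cilium install
cilium connectivity test

# If pod-to-world passes, cycle node pools to get fresh worker IPs and
# re-run the tests. Pool names are illustrative; list the real ones with
#   doctl kubernetes cluster node-pool list cilium-repro
doctl kubernetes cluster node-pool create cilium-repro \
  --name retry-pool --size s-2vcpu-4gb --count 2
doctl kubernetes cluster node-pool delete cilium-repro default-pool
cilium connectivity test
```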