kube-router
Pods with hostNetwork=true can't connect to Kube API Server
What happened?
When I deploy kube-router with all features enabled (without kube-proxy) and then deploy Traefik with hostNetwork=true, the Traefik pod cannot reach https://10.96.0.1:443/version; the requests time out.
What did you expect to happen?
A pod deployed with hostNetwork=true should be able to reach the Kube API server. This works when I disable the service proxy in kube-router and run kube-proxy instead, but I want to use only kube-router.
How can we reproduce the behavior you experienced? Steps to reproduce the behavior:
- Step 1: Create a k8s cluster (in my case with k0s).
- Step 2: Deploy kube-router with all features.
- Step 3: Deploy Traefik with hostNetwork=true. It can't reach the Kube API service.
**Screenshots / Architecture Diagrams / Network Topologies**
Traefik logs:
k0s kubectl logs adparts-adgest-traefik-ff44b5bdb-h544b -n adparts
time="2024-04-27T18:10:09Z" level=info msg="Configuration loaded from flags."
time="2024-04-27T18:10:21Z" level=error msg="Error watching kubernetes events: could not retrieve server version: Get \"https://10.96.0.1:443/version\": net/http: TLS handshake timeout" providerName=kubernetes
time="2024-04-27T18:10:22Z" level=error msg="Provider connection error: could not retrieve server version: Get \"https://10.96.0.1:443/version\": net/http: TLS handshake timeout; retrying in 519.955504ms" providerName=kubernetes
time="2024-04-27T18:10:52Z" level=error msg="Error watching kubernetes events: could not retrieve server version: Get \"https://10.96.0.1:443/version\": dial tcp 10.96.0.1:443: i/o timeout" providerName=kubernetes
time="2024-04-27T18:10:53Z" level=error msg="Provider connection error: could not retrieve server version: Get \"https://10.96.0.1:443/version\": dial tcp 10.96.0.1:443: i/o timeout; retrying in 390.195589ms" providerName=kubernetes
time="2024-04-27T18:11:24Z" level=error msg="Error watching kubernetes events: could not retrieve server version: Get \"https://10.96.0.1:443/version\": dial tcp 10.96.0.1:443: i/o timeout" providerName=kubernetes
time="2024-04-27T18:11:25Z" level=error msg="Provider connection error: could not retrieve server version: Get \"https://10.96.0.1:443/version\": dial tcp 10.96.0.1:443: i/o timeout; retrying in 468.528865ms" providerName=kubernetes
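The log shows two different failure modes: "TLS handshake timeout" means the TCP connection was established but the handshake stalled, while "dial tcp ... i/o timeout" means the TCP connection itself never completed. A minimal sketch for telling the two apart from the affected node follows; the function name `probe` and its return strings are made up for illustration, and certificate verification is deliberately disabled because this only checks reachability.

```python
import socket
import ssl


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Report whether the TCP connect or the TLS handshake is the failing stage.

    Hypothetical helper, not part of kube-router or Traefik.
    """
    # Stage 1: plain TCP connect (corresponds to "dial tcp ... i/o timeout").
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return f"tcp-failed: {exc}"

    # Stage 2: TLS handshake on the connected socket
    # (corresponds to "TLS handshake timeout").
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # reachability check only,
    ctx.verify_mode = ssl.CERT_NONE  # so skip certificate validation
    try:
        with ctx.wrap_socket(sock) as tls:
            return f"ok: {tls.version()}"
    except OSError as exc:  # covers ssl.SSLError and socket timeouts
        return f"tls-failed: {exc}"
    finally:
        sock.close()


# Example call from a hostNetwork pod or the node itself:
#   print(probe("10.96.0.1", 443))
```

Running the same probe against the API server's real node address and port, in addition to the ClusterIP, may help show whether only the service VIP path is broken.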
**System Information (please complete the following information):**
- Kube-Router Version (kube-router --version): v2.1.0
- Kube-Router Parameters: all the default params in all-features-daemonset.yaml
- Kubernetes Version (kubectl version): 1.29
- Cloud Type: On Premise
- Kubernetes Deployment Type: K0S
- Kube-Router Deployment Type: DaemonSet
- Cluster Size: 4 nodes
**Logs, other output, metrics**
Additional context
As far as I can see in iptables, kube-proxy creates a KUBE-SERVICES chain and allows 10.96.0.1 on port 443 there, but kube-router doesn't have a similar chain or rule.
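One possible reason for the missing chain: kube-router's service proxy is built on IPVS rather than on per-service iptables chains, so the equivalent of a KUBE-SERVICES entry would show up in `ipvsadm -Ln` output instead. Below is a rough sketch that parses such output and lists the backends for a given service address. The function name and the sample text are illustrative only and were not taken from this cluster.

```python
from typing import List


def ipvs_backends(ipvsadm_output: str, vip: str, port: int) -> List[str]:
    """Return the real-server addresses listed under vip:port.

    Hypothetical helper that parses text in the layout printed by `ipvsadm -Ln`.
    """
    target = f"{vip}:{port}"
    backends: List[str] = []
    in_service = False
    for line in ipvsadm_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "->":
            # Real-server line; it belongs to the most recent virtual service.
            if in_service and len(fields) > 1:
                backends.append(fields[1])
        else:
            # Virtual-service line (or a header line, which never matches).
            in_service = (
                fields[0] in ("TCP", "UDP", "SCTP")
                and len(fields) > 1
                and fields[1] == target
            )
    return backends


# Made-up sample in the usual `ipvsadm -Ln` layout, for illustration only.
SAMPLE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.96.0.1:443 rr
  -> 192.168.1.10:6443            Masq    1      0          0
TCP  10.96.0.10:53 rr
  -> 10.244.0.5:53                Masq    1      0          0
"""

print(ipvs_backends(SAMPLE, "10.96.0.1", 443))
```

If the real output on a node returns an empty list for 10.96.0.1:443, that would suggest the service was never programmed, which points at kube-router itself rather than at Traefik.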
Thanks in advance
I don't run k0s, so it might be a while before I can find time to set up an environment and test it myself.
However, it sounds like kube-router may not be starting, and as such it is not creating the kube-apiserver ClusterIP that Traefik needs. I would imagine that this most likely comes from kube-router not being able to talk to the kube-apiserver.
Are you able to see any logs from kube-router?
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stale for 5 days with no activity.