kubeone
Traffic seems to be routed via nodes' public IPs instead of private IPs
What happened:
I have a kubeone cluster set up at Hetzner via the example terraform scripts, which include a private network. The only change we made is to add worker pools for a list of datacenters:
variable "datacenters" {
type = list(string)
default = ["nbg1", "fsn1"]
}
output "kubeone_workers" {
description = "Workers definitions, that will be transformed into MachineDeployment object"
value = {
for idx, datacenter in var.datacenters:
# following outputs will be parsed by kubeone and automatically merged into
# corresponding (by name) worker definition
"${var.cluster_name}-pool${idx + 1}" => {
replicas = var.workers_replicas
providerSpec = {
sshPublicKeys = [file(var.ssh_public_key_file)]
operatingSystem = var.worker_os
operatingSystemSpec = {
distUpgradeOnBoot = false
}
cloudProviderSpec = {
# provider specific fields:
# see example under `cloudProviderSpec` section at:
# https://github.com/kubermatic/machine-controller/blob/master/examples/hetzner-machinedeployment.yaml
serverType = var.worker_type
location = datacenter
image = var.image
networks = [
hcloud_network.net.id
]
# Datacenter (optional)
# datacenter = ""
labels = {
"${var.cluster_name}-workers" = "pool1"
}
}
}
}
}
}
The resulting nodes look like this:
# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
staging-control-plane-1 Ready control-plane,master 29d v1.20.6 192.168.0.3 195.201.XXX.XXX Ubuntu 20.04.2 LTS 5.4.0-72-generic docker://19.3.14
staging-control-plane-2 Ready control-plane,master 29d v1.20.6 192.168.0.5 162.55.XXX.XXX Ubuntu 20.04.2 LTS 5.4.0-72-generic docker://19.3.14
staging-control-plane-3 Ready control-plane,master 29d v1.20.6 192.168.0.4 195.201.XXX.XXX Ubuntu 20.04.2 LTS 5.4.0-72-generic docker://19.3.14
staging-pool1-5d679cf75-464fm Ready <none> 16h v1.20.6 192.168.0.9 195.201.XXX.XXX Ubuntu 20.04.2 LTS 5.4.0-72-generic docker://19.3.15
staging-pool2-84c786cf67-dxf9p Ready <none> 17h v1.20.6 192.168.0.7 162.55.XXX.XXX Ubuntu 20.04.2 LTS 5.4.0-72-generic docker://19.3.15
When I traceroute a kubernetes service (e.g. backend.default.svc.cluster.local) I see that the traffic is routed via the public IP of the nodes instead of the IP within the private network:
# kubectl run ubuntu --image ubuntu -- sleep infinity
# kubectl exec -it ubuntu -- bash
root@ubuntu:/# traceroute backend.default.svc.cluster.local
traceroute to backend.default.svc.cluster.local (10.110.120.137), 30 hops max, 60 byte packets
1 static.XXX.XXX.55.162.clients.your-server.de (162.55.XXX.XXX) 0.252 ms 0.052 ms 0.015 ms
2 172.31.1.1 (172.31.1.1) 11.795 ms 11.217 ms 11.723 ms
3 11202.your-cloud.host (159.69.96.89) 0.272 ms 0.410 ms 0.379 ms
4 * * *
5 spine1.cloud2.fsn1.hetzner.com (213.239.225.41) 0.954 ms 1.292 ms 1.262 ms
6 core23.fsn1.hetzner.com (213.239.239.137) 3.423 ms core23.fsn1.hetzner.com (213.239.239.125) 2.076 ms core24.fsn1.hetzner.com (213.239.239.133) 4.113 ms
7 core11.nbg1.hetzner.com (213.239.245.225) 6.156 ms core11.nbg1.hetzner.com (213.239.203.125) 10.360 ms 6.085 ms^C
Here 162.55.XXX.XXX is the public IP of the node. I'd expect the traffic to be sent to 192.168.0.7 instead. I checked on a GKE cluster, and there the traffic appears to be routed via the private IPs.
As a consequence, if I apply a firewall which prevents access to the nodes' public IPs, the cluster networking becomes non-operational in the sense that DNS lookups no longer work and services cannot be reached.
What is the expected behavior:
In-cluster traffic should be routed via private IPs and not via public IPs. I should also be able to restrict public node IP access via firewall and the cluster should stay operational.
How to reproduce the issue:
I did not try it with a fresh install, but the steps to reproduce should be:
- Install kubeone on Hetzner with the default terraform templates
- Create a default pod with a service (nginx should suffice)
- Create a second pod (e.g. ubuntu) and traceroute the service created in the previous step (a minimal command sequence is sketched below)
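A sketch of the last two steps, assuming the default namespace and the pod/service names used later in this thread:
$ kubectl run nginx --image nginx
$ kubectl expose pod nginx --port 80
$ kubectl run ubuntu --image ubuntu -- sleep infinity
$ kubectl exec -it ubuntu -- bash -c "apt update && apt install -y traceroute && traceroute nginx.default.svc.cluster.local"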
Anything else we need to know?
Information about the environment:
KubeOne version (kubeone version): Cluster was created with kubeone 1.2.1 but was updated to 1.2.2 and then 1.2.3 recently. MachineDeployments have been restarted via https://docs.kubermatic.com/kubeone/master/cheat_sheets/rollout_machinedeployment/
Operating system: Ubuntu 20.04.2 LTS
Provider you're deploying cluster on: Hetzner
Operating system you're deploying on: MacOS
Hope you can help me with that! Thank you a lot!
I'm not sure if cross-datacenter traffic can be sent over the private IPs. I suppose that question should be directed at Hetzner Cloud themselves.
OK, I've tried creating VMs in different DCs, and they are able to communicate with each other over the private network.
@namelessvoid can you please build kubeone from the latest master and try it with that?
I'm getting different results
root@ubuntu:/# traceroute 10.244.7.2
traceroute to 10.244.7.2 (10.244.7.2), 30 hops max, 60 byte packets
1 static.123.164.55.162.clients.your-server.de (162.55.164.123) 0.156 ms 0.066 ms 0.073 ms
2 10.244.7.0 (10.244.7.0) 4.442 ms 4.234 ms 4.131 ms
3 10.244.7.2 (10.244.7.2) 4.282 ms 4.042 ms 3.844 ms
where 10.244.7.2 is the overlay IP of the pod running in the other datacenter.
Maybe I'm getting it wrong but shouldn't the first hop be the virtual network IP of your node? 162.55.164.123 is the public IP, isn't it? Disclaimer: I'm not too deep into k8s networking 🙈
I'll try the latest master as soon as I can (I'm a bit tied up by releases right now).
Just for completeness, I tried a fresh cluster installed with kubeone 1.2.3 and see these results:
1 static.170.210.55.162.clients.your-server.de (162.55.210.170) 0.147 ms 0.033 ms 0.024 ms
2 172.31.1.1 (172.31.1.1) 13.458 ms 13.301 ms 13.008 ms
3 11685.your-cloud.host (195.201.67.143) 0.607 ms 0.478 ms 0.515 ms
Then I built kubeone from master and retried on another freshly installed cluster:
root@ubuntu:/# traceroute nginx.default.svc.cluster.local
traceroute to nginx.default.svc.cluster.local (10.103.180.121), 30 hops max, 60 byte packets
1 static.97.89.201.138.clients.your-server.de (138.201.89.97) 0.062 ms 0.027 ms 0.022 ms
2 172.31.1.1 (172.31.1.1) 14.458 ms 14.358 ms 14.322 ms
3 12740.your-cloud.host (136.243.181.165) 0.512 ms 0.447 ms 0.395 ms
kubeone version for the self-built one shows
{
  "kubeone": {
    "major": "1",
    "minor": "2",
    "gitVersion": "v1.2.0-rc.0-65-gab496ef",
    "gitCommit": "ab496efdaa222e92f14a1d0cbe63149d57f8cc53",
    "gitTreeState": "",
    "buildDate": "2021-06-22T11:48:09+02:00",
    "goVersion": "go1.16.5",
    "compiler": "gc",
    "platform": "darwin/amd64"
  },
  "machine_controller": {
    "major": "1",
    "minor": "30",
    "gitVersion": "v1.30.0",
    "gitCommit": "",
    "gitTreeState": "",
    "buildDate": "",
    "goVersion": "",
    "compiler": "",
    "platform": "linux/amd64"
  }
}
Test setup:
$ kubectl run nginx --image nginx
$ kubectl expose pod nginx --port 80
$ kubectl run ubuntu --image ubuntu -- sleep infinity
$ kubectl exec -it ubuntu -- bash
# apt update && apt install traceroute -y
# traceroute nginx.default.svc.cluster.local
Did a third test by installing the cluster from the example terraform files.
Kubeone manifest looks like this:
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
  kubernetes: '1.20.6'
cloudProvider:
  hetzner: {}
  external: true
addons:
  enable: true
  path: "./addons"
For the test, ./addons was empty.
I tried both a cluster with a single worker node and a cluster with two worker nodes. The traceroute results remain the same: traffic is routed via public IPs.
I'm attaching some screenshots of the networking section of the Hetzner Cloud Console. This should be set up correctly, shouldn't it?

I'm happy for any ideas for further debugging! Thank you a lot! :)
Ok, maybe I found something - sorry for not thinking about this earlier!
When I traceroute the pod IP as you did, @kron4eg, I also see the traffic using the overlay IP:
$ traceroute 10.244.8.36
traceroute to 10.244.8.36 (10.244.8.36), 30 hops max, 60 byte packets
1 static.XXX.XXX.XXX.162.clients.your-server.de (162.XXX.XXX.XXX) 0.132 ms 0.033 ms 0.021 ms
2 10.244.8.0 (10.244.8.0) 3.768 ms 3.641 ms 3.543 ms
3 10-244-8-36.nginx.default.svc.cluster.local (10.244.8.36) 3.622 ms 3.497 ms 3.490 ms
I'm still confused, though, why the public IP shows up in the trace.
But when accessing the service exposing the very same pod, it seems to take the public route again:
$ traceroute 10.109.255.202
traceroute to 10.109.255.202 (10.109.255.202), 30 hops max, 60 byte packets
1 static.XXX.XXX.XXX.162.clients.your-server.de (162.55.166.14) 0.080 ms 0.039 ms 0.022 ms
2 172.31.1.1 (172.31.1.1) 10.880 ms 9.905 ms 10.592 ms
3 11202.your-cloud.host (159.69.96.89) 0.447 ms 0.332 ms 0.320 ms
4 * * *
5 spine2.cloud2.fsn1.hetzner.com (213.239.225.45) 1.018 ms spine1.cloud2.fsn1.hetzner.com (213.239.225.41) 0.958 ms spine2.cloud2.fsn1.hetzner.com (213.239.225.45) 1.263 ms
6 core23.fsn1.hetzner.com (213.239.239.137) 13.665 ms 2.714 ms core24.fsn1.hetzner.com (213.239.239.129) 4.106 ms
7 core11.nbg1.hetzner.com (213.239.203.125) 7.735 ms core12.nbg1.hetzner.com (213.239.203.121) 10.383 ms core11.nbg1.hetzner.com (213.239.203.125) 16.566 ms
...
So maybe some setting for the service overlay is not correct?
@kron4eg Could you maybe retry this on your end to confirm this? Thank you a lot!
I'll try to reproduce
@namelessvoid I still can't replicate that behaviour (using master build). Could you please attach your manifests (workloads/services/etc)?
@kron4eg Sorry for the late response, some stuff got in my way in between...
There is nothing special, I believe:
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
  namespace: default
spec:
  containers:
  - image: nginx
    name: nginx
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: nginx
  name: nginx
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: nginx
  type: ClusterIP
I can confirm this issue.
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
t1-control-plane-1 Ready control-plane,master 80m v1.21.3 10.8.0.2 188.34.X.X Ubuntu 20.04.2 LTS 5.4.0-77-generic containerd://1.4.8
t1-pool1-54f9cd8694-drz4m Ready <none> 77m v1.21.3 10.8.0.3 162.55.X.X Ubuntu 20.04.2 LTS 5.4.0-77-generic containerd://1.4.8
Testing with the manifests @namelessvoid provided in their last post:
root@ubuntu:/# traceroute 10.244.1.2
traceroute to 10.244.1.2 (10.244.1.2), 30 hops max, 60 byte packets
1 static.103.165.55.162.clients.your-server.de (162.55.X.X) 0.100 ms 0.030 ms 0.065 ms
2 10-244-1-2.nginx.default.svc.cluster.local (10.244.1.2) 0.223 ms 0.063 ms 0.069 ms
The first hop (162.55.X.X) is the external IP of the node. That should be 10.8.0.3 instead.
EDIT: OK, I suppose it was a false alarm. Pods keep talking to each other even though I'm now blocking all external traffic to the nodes. I'm still confused that the external IP shows up in the traceroute, though.
The first hop (162.55.X.X) is the external IP of the node
It is the node's own IP. This IP is the default route for pods.
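One way to see this is to print the routing table from inside a pod; a minimal sketch, assuming the ubuntu pod from earlier (iproute2 is not part of the base image, hence the install):
$ kubectl exec -it ubuntu -- bash -c "apt update && apt install -y iproute2 && ip route"
The pod's default route points back at the node it is running on, which is why the node shows up as the first hop in the traceroutes above.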
Can we somehow configure the internal IP to be used as the node's IP? Yesterday I said
Pods keep talking to each other even though I'm now blocking all external traffic to the nodes.
but that is only true if I use the SDN firewall provided by Hetzner. When I use iptables on the nodes to block all incoming traffic via the interface eth0, the pods can't communicate anymore.
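For illustration, blocking ingress on eth0 while keeping the private interface open looks something like this (a sketch; assuming ens10 is the interface attached to the Hetzner private network, the name may differ):
# keep loopback, the private network and already-established flows, drop the rest on eth0
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -i ens10 -j ACCEPT
iptables -A INPUT -i eth0 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -i eth0 -j DROP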
I'd actually like to be able to disable the public interface completely. Is that somehow feasible with kubeone?
@Lykos153 I suppose it can be achieved by using custom images.
I can now say for sure that DNS traffic is still routed via the public interface. With all incoming public connections blocked, pods can reach each other via IP but not via service hostnames. Also, every request from pods to the internet has a ~5s delay due to DNS timeout. The cluster is not usable unless I open ports 9*53 on the public network. I'm gonna try to get rid of the public interface using a custom image as you suggested. The issue remains, however.
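One way to confirm where the DNS queries actually leave is to capture port 53 traffic on the public interface while a pod does a lookup; a sketch, assuming eth0 is the public interface (tcpdump may need to be installed on the node first):
# on the node hosting the client pod
tcpdump -ni eth0 port 53
# in parallel, trigger a lookup from a pod
kubectl exec -it ubuntu -- getent hosts nginx.default.svc.cluster.local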
Any update here? We have the same issue.
We need to whitelist the public IP ranges as trusted IPs in our ingress to make the proxy protocol work.
Same issue, it makes firewalling horrible. I have manually patched the kubeconfigs to use the private IP... Maybe the kubeadm args can be overridden somewhere.
@alam0rt did it help?
It helps, but it gets overridden on upgrade as the kubeadm config is regenerated.
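If anyone wants to check which endpoint their components currently point at, the places kubeadm writes it to can be inspected like this (a sketch; paths are the kubeadm defaults):
# the endpoint baked into the cluster-wide kubeadm configuration
kubectl -n kube-system get configmap kubeadm-config -o yaml | grep controlPlaneEndpoint
# the endpoint a given kubelet talks to (run on a node)
grep server: /etc/kubernetes/kubelet.conf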
For the time being I am just adding the public IPs to the rules using
data "hcloud_servers" "nodes" {
with_selector = "role=node"
}
locals {
node_public_ipv4 = [for node in data.hcloud_servers.nodes.servers : join("/", [node.ipv4_address, "32"])]
}
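That local can then feed a firewall rule; a rough sketch (the resource name and the port range are placeholders, not taken from the actual setup):
resource "hcloud_firewall" "allow_node_public_ips" {
  name = "allow-node-public-ips"

  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "1-65535"
    source_ips = local.node_public_ipv4
  }
}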
The admin kubeconfig is generated using the value from the terraform output kubeone_api. By default this value is the public IP of the kube-apiserver load balancer. I don't see that hcloud_load_balancer can give you an internal IP.
output "kubeone_api" {
description = "kube-apiserver LB endpoint"
value = {
endpoint = hcloud_load_balancer.load_balancer.ipv4
apiserver_alternative_names = var.apiserver_alternative_names
}
}
There definitely is a private IP that can be used. I'll give it a go soon and see what happens.
So, it looks like you can use
  value = {
    endpoint = hcloud_load_balancer.load_balancer.network_ip
  }
}
network_ip is defined here: https://github.com/hetznercloud/terraform-provider-hcloud/blob/d6f4207b2b75b76e007bd08602e6dcbfb1740032/internal/loadbalancer/resource.go#L406
but is apparently undocumented!
OK, having the INTERNAL IP as the kube-apiserver endpoint means that the kubeconfigs for the whole system will contain that IP, including the admin config. Kubeone will work around that, so it's not an issue (we always tunnel kube-apiserver requests via ssh).
However, your local kubectl might have a problem, but worry not, kubeone proxy to the rescue! kubeone proxy will create a pass-through ssh-tunnel proxy that kubectl can easily leverage with export HTTPS_PROXY=http://....
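A minimal usage sketch (the --manifest/-t flags are the same ones used with kubeone apply elsewhere in this thread; the actual listen address is printed by the command, the port below is just a placeholder):
# terminal 1: open the tunnel
kubeone proxy --manifest kubeone.yaml -t tf.json
# terminal 2: point kubectl at it
export HTTPS_PROXY=http://127.0.0.1:8888
kubectl get nodes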
Speaking of which, is there a good way to regenerate all of the kubeconfigs? I have updated the terraform output and ran kubeone apply --manifest kubeone.yaml -t new.json, but I don't think anything was updated. Maybe I need to force an upgrade?
No, I don't think it's possible, at least not under kubeadm. You'd need to create a new cluster.
Damn! New cluster it is I guess.
I mean, it can be done manually, but it's highly likely you'll kill your cluster. But if you'd like to try, here's how:
- You'd need to regenerate the certificates for kube-apiserver with a new SAN list that includes the internal IP of the loadbalancer (a rough sketch of this step is below)
- Then replace all the kubelets' kubeconfigs to point to the new LB
- Then replace the kube-proxy config in its ConfigMap, restart all the kube-proxies across the cluster, and pray the cluster is not dead after this
But I highly recommend not doing this on a cluster that has anything valuable running on it.
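For the first step, a very rough sketch of what re-issuing the kube-apiserver certificate with an extended SAN list could look like on a control-plane node (paths are the kubeadm defaults; /root/kubeadm.yaml is a hypothetical file holding the ClusterConfiguration with the internal LB IP added to apiServer.certSANs):
# dump the current cluster configuration so the SAN list can be edited
kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' > /root/kubeadm.yaml
# move the old cert out of the way, then let kubeadm re-issue it from the edited config
mv /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/apiserver.key /root/
kubeadm init phase certs apiserver --config /root/kubeadm.yaml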
Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with /remove-lifecycle stale.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale Docs are still pending.