tectonic-installer
Kube-flannel may configure Flannel with a stale interface IP
Versions
- Tectonic version (release or commit hash): bb1007decae57b4d933734a13d550587fa2d9c45
- Terraform version (terraform version): 0.10.8
- Platform (aws|azure|openstack|metal): digitalocean
What happened?
Flannel pods fail on account of not finding their corresponding network interfaces. It seems to me this happens because kube-flannel configures Flannel with an interface IP that can be stale.
From debugging I found that failures were caused by Flannel being configured with an interface IP corresponding to nodes from previous clusters of mine, so I'm gonna guess that kube-flannel's $(POD_IP) variable is determined from DNS lookups against nodes. If the DNS lookups return cached/stale values, Flannel will be configured wrongly as seems to happen in my case.
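One way to confirm this kind of failure on an affected node is to check whether the address in POD_IP is actually owned by a local interface. The following is a minimal sketch, not part of kube-flannel: the helper name is_local_ip is hypothetical, it handles IPv4 only, and it assumes the kernel does not allow non-local binds (on Linux, net.ipv4.ip_nonlocal_bind=0).

```python
import errno
import os
import socket


def is_local_ip(ip: str) -> bool:
    """Return True if `ip` can be bound on this host, i.e. a local interface owns it.

    Hypothetical diagnostic helper; IPv4 only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # Port 0 lets the OS pick any free port; only the address matters here.
            sock.bind((ip, 0))
        except OSError as exc:
            if exc.errno == errno.EADDRNOTAVAIL:
                # No local interface carries this address: it is stale or wrong.
                return False
            raise
    return True


if __name__ == "__main__":
    pod_ip = os.environ.get("POD_IP")
    if not pod_ip:
        print("POD_IP is not set")
    elif is_local_ip(pod_ip):
        print(f"{pod_ip} is assigned to a local interface")
    else:
        print(f"{pod_ip} is NOT assigned to any local interface (stale?)")
```

Run inside the container (or on the node with POD_IP exported by hand), a "NOT assigned" result would match the symptom described above.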
What I'm wondering is why it's necessary to configure Flannel's --iface option explicitly, instead of letting Flannel determine the interface automatically? I don't know how Flannel's automatic interface detection works, but hopefully it's less fragile than the current solution, which seems to break when DNS lookups return stale IPs.
Hey @aknuds1, making it explicit ensures consistency and robustness across different platforms and environments. The POD_IP value comes directly from the IP assigned to the pod at creation time by Kubernetes (https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/), so I can't see how it can be stale.
@enxebre Consider what I said initially: "I'm gonna guess that kube-flannel's $(POD_IP) variable is determined from DNS lookups against nodes. If the DNS lookups return cached/stale values, Flannel will be configured wrongly as seems to happen in my case".
So, it doesn't seem to work as well in practice as you seem to think, unfortunately. I have seen myself that $POD_IP is stale, so there's no question about that. I'm not sure how it happens, but like I said my theory is that it's because of DNS lookups returning stale values.
There is also the question of whether using Flannel's --iface flag works better than Flannel's own automatic detection; that is my question here. I see that the current method breaks, so we should investigate whether letting Flannel detect the interface automatically works better.
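For comparison, Flannel's documentation describes the default for --iface as the interface for the machine's default route. A rough way to see which address that would be on a node is sketched below. This is an approximation, not Flannel's own detection code: the helper name default_route_ip is hypothetical, and the probe address 192.0.2.1 (a documentation-only range) is an arbitrary choice that is never actually contacted.

```python
import socket
from typing import Optional


def default_route_ip(probe_addr: str = "192.0.2.1") -> Optional[str]:
    """Return the source IPv4 address the kernel would use for off-host traffic.

    Hypothetical helper. Connecting a UDP socket sends no packets; it only
    asks the kernel to pick a route and a source address for `probe_addr`.
    Returns None if the host has no usable route.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect((probe_addr, 9))
        except OSError:
            return None
        return sock.getsockname()[0]


if __name__ == "__main__":
    print("default-route address:", default_route_ip())
```

If this address differs from the value in $POD_IP on a failing node, that would support the theory that the explicitly configured IP is the stale one.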