Ability to configure interface used for Flannel
Feature Request
Summary
Add a Talos configuration option that controls the -iface flag passed to flanneld, in case the interface through which other cluster nodes are reachable is not the trivial default.
Motivation
To my understanding, flannel selects one of the node's network interfaces as the one that traffic destined for other nodes should be routed through. It can be set on the flanneld command line, but the fallback is to use the same interface as the node's default route.
I have a cluster in which some of the nodes are behind NAT, so naturally I've set up Wireguard between them. Now each node has an eth0 interface with access to the internet (via NAT), and a wireguard interface with direct access to all other nodes in the cluster. It seems sensible to me that the default route should be set via the ethernet interface, with a specific route to the node IP subnet via the wireguard interface.
Flannel is therefore selecting the ethernet interface for node-to-node communication, but the other nodes aren't routable through this interface. Being able to add -iface wg0 to flanneld's command line would solve this problem.
I've worked around this using kubectl edit on the flannel daemonset, but I think it would be in the spirit of Talos Linux for this to be configurable through Talos.
Alternative solutions
An alternative approach could be to make flannel more intelligent when it comes to selecting the interface, e.g. by looking up every node IP in the system route table. However, this seems a lot more complex, and I'm not sure how to handle the scenario where not every node IP is routed through the same interface.
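A rough sketch of what such a lookup could look like, written in Python rather than flannel's Go and not taken from flannel itself. The route table, node IPs, and function names are all made up for illustration; a real implementation would read the kernel route table instead of a hard-coded list.

```python
import ipaddress

# Hypothetical route table as (destination prefix, interface) pairs,
# mirroring the setup described above.
ROUTES = [
    ("0.0.0.0/0", "eth0"),     # default route: internet via NAT
    ("10.10.0.0/24", "wg0"),   # node IP subnet via wireguard (assumed range)
]


def lookup_iface(ip, routes):
    """Return the interface of the longest-prefix route matching ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, iface in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None


def pick_iface(node_ips, routes):
    """Return the one interface that every node IP is routed through.

    Raises ValueError if the nodes are split across interfaces (or a node
    has no route at all), which is the open question mentioned above.
    """
    ifaces = {lookup_iface(ip, routes) for ip in node_ips}
    if len(ifaces) != 1 or None in ifaces:
        raise ValueError(f"no single interface reaches all nodes: {ifaces}")
    return ifaces.pop()


if __name__ == "__main__":
    # An address outside the node subnet follows the default route.
    print(lookup_iface("8.8.8.8", ROUTES))
    # Node IPs resolve to the wireguard interface instead.
    print(pick_iface(["10.10.0.2", "10.10.0.3"], ROUTES))
```

With this table, the default route points at eth0 while both node IPs resolve to wg0. The sketch simply refuses to choose when node IPs resolve to different interfaces, since it is not obvious what the right behaviour would be there.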
not directly related to the question, but sounds like KubeSpan might do exactly what you're looking for?
Talos provides default simple Flannel configuration which can be replaced by any other CNI: https://www.talos.dev/v1.1/reference/configuration/#clusternetworkconfig
> not directly related to the question, but sounds like KubeSpan might do exactly what you're looking for?
TIL - this looks pretty neat, although I don't know if it solves my use case since there are hosts I'd like to join to wireguard that don't run kubernetes (so they can access the apiserver for management and/or the applications running inside kubernetes). But I'd definitely like to see how KubeSpan doesn't hit the same flannel interface problem.
> Talos provides default simple Flannel configuration which can be replaced by any other CNI: https://www.talos.dev/v1.1/reference/configuration/#clusternetworkconfig
Yeah, this is what I'll do for now (better than kubectl edit at least).
Thanks!
KubeSpan takes over Flannel traffic and sends it directly over Wireguard. It doesn't, however, support joining non-Talos nodes to the Wireguard network.
On the other hand, you could have an 'exit' node which does Wireguard for the users, while KubeSpan handles in-cluster traffic.