calico
[v3.26] The can-reach parameter supports multiple target values
Description
Our cluster has to cope with a variety of unstable situations and cannot be connected to the external network, so we want to use the can-reach parameter with support for multiple target values. I made some changes to that end, and so far they work well.
Related issues/PRs
Todos
- [x] Tests
  In our cluster, /etc/hosts on every node resolves only master01 through master03.
  In the calico-node YAML resource I set IP_AUTODETECTION_METHOD to can-reach=master08,master09,master01.
  The calico-node log shows that it correctly found an available IP.
- [x] Documentation
  You can set can-reach to multiple comma-separated target values, similar to cidr:

      - name: IP_AUTODETECTION_METHOD
        value: can-reach=master08,master09,master01

- [ ] Release note
  The can-reach parameter supports multiple target values
Release Note
TBD
@luanshuo I'm not 100% sure this change is necessary. The can-reach method of detection doesn't actually require network connectivity: it doesn't send packets, it just looks up the interface that would be used if a connection were going to be established.
I suppose this would be necessary if you had a can-reach target that was unreachable based on the host routing (i.e., no routes on the host cover it) - is that what your network looks like?
I have a slight worry about allowing multiple targets for this option. Namely, that if the destination used for the lookup changes over time, it can result in the node's auto-detected IP changing unnecessarily which can cause unwanted network instability. That said, it should only happen if not all of the provided can-reach targets resolve to the same interface, so I'm not necessarily against this, I just wonder if there's a better way to satisfy your needs than using the can-reach parameter.
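To illustrate the point that a can-reach style lookup needs a route but no actual traffic, here is a minimal Go sketch. It assumes the lookup is done by connecting a UDP socket and reading back the local address; that is one common way to ask the kernel for a source address and is not claimed to be the exact code in calico-node. The function name `localAddrFor` and the port are made up for the example.

```go
package main

import (
	"fmt"
	"net"
)

// localAddrFor returns the source IP the kernel would choose for traffic
// to target. Connecting a UDP socket performs a local route lookup and
// binds a source address; no packet is sent to the target itself.
// Note: if target is a hostname, it must still be resolved first
// (for example via /etc/hosts), and that step can fail on its own.
func localAddrFor(target string) (net.IP, error) {
	// The port is arbitrary for a UDP connect; 9 is only a placeholder.
	conn, err := net.Dial("udp", net.JoinHostPort(target, "9"))
	if err != nil {
		// Typical causes: the name does not resolve, or no route on the
		// host covers the destination.
		return nil, err
	}
	defer conn.Close()

	addr, ok := conn.LocalAddr().(*net.UDPAddr)
	if !ok {
		return nil, fmt.Errorf("unexpected local address type %T", conn.LocalAddr())
	}
	return addr.IP, nil
}

func main() {
	// Loopback always has a route, so this runs even on an isolated host.
	ip, err := localAddrFor("127.0.0.1")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("source address:", ip)
}
```

Passing an IP literal instead of a hostname skips name resolution entirely, so the result depends only on the host's routing table.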
@caseydavenport I'm not sure whether everyone needs this feature, but it is necessary in the scenarios we face.
- We provide a node deletion function, which also removes the deleted node's resolution entries from /etc/hosts. If we delete the master01 node from the cluster while using can-reach=master01, can-reach=master01 can no longer find the correct IP address. With multiple values, for example can-reach=master02,master03, this can be avoided.
- We have a number of old clusters in which some nodes have no master01 entry in /etc/hosts. We do not want to add that entry on every node, because nodes are also added to and deleted from these old clusters.

So we want to be able to set multiple values, so that we don't need many code changes and node operations.
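The fallback behaviour being asked for can be sketched roughly as follows. This is an illustration of the idea only, not the code in this PR: `firstReachable`, `lookupFunc`, and `stubLookup` are hypothetical names, and the address 10.10.10.11 is invented for the stub. The lookup is injected so the example runs without any network or /etc/hosts setup.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// lookupFunc maps one can-reach target to a local IP (hypothetical hook).
type lookupFunc func(target string) (string, error)

// firstReachable parses a value such as "can-reach=a,b,c" and returns the
// address found for the first target whose lookup succeeds, in list order.
func firstReachable(method string, lookup lookupFunc) (string, error) {
	const prefix = "can-reach="
	if !strings.HasPrefix(method, prefix) {
		return "", fmt.Errorf("not a can-reach method: %q", method)
	}
	var lastErr error = errors.New("no targets given")
	for _, target := range strings.Split(strings.TrimPrefix(method, prefix), ",") {
		target = strings.TrimSpace(target)
		if target == "" {
			continue
		}
		ip, err := lookup(target)
		if err == nil {
			return ip, nil
		}
		// Remember why this target failed and try the next one.
		lastErr = fmt.Errorf("%s: %v", target, err)
	}
	return "", fmt.Errorf("no can-reach target usable (last error: %v)", lastErr)
}

// stubLookup pretends that only master01 resolves, as on the test nodes
// described in this PR. The returned address is made up.
func stubLookup(target string) (string, error) {
	if target == "master01" {
		return "10.10.10.11", nil
	}
	return "", errors.New("no such host")
}

func main() {
	ip, err := firstReachable("can-reach=master08,master09,master01", stubLookup)
	fmt.Println(ip, err)
}
```

Because the first successful target wins, the detected address can change whenever an earlier target starts or stops resolving, unless all targets map to the same interface. That is the instability concern raised earlier in this thread.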
@luanshuo I guess I just wonder if can-reach is really the right mechanism for that environment then. There are a number of other options available: https://docs.tigera.io/calico-cloud/networking/ipam/ip-autodetection#autodetection-methods
For example, specifying an interface regex or using the Kubernetes InternalIP of the node might be more appropriate here, unless all of your nodes have different interface naming?
Alternatively, I think you should be able to use can-reach with an IP address instead of using the hostnames of the nodes, so that can-reach is no longer coupled to /etc/hosts lookups at all. The IP address doesn't even need to be a real address in the network, since like I said above can-reach doesn't send traffic, it just does a local routing lookup.
Like I said - not necessarily against this PR - but I'd rather find a way with existing methods if we can, since I think specifying multiple can-reach options has some corner-cases that could result in instability when node IPs / interfaces change.
@caseydavenport We didn't try only can-reach. We started with interface=eth0, moved to interface=^e.*, and then used cidr=10.10.10.0/24 and other methods; we tried all of them. However, because of the complexity of the scenario we face, each of these options covers only part of it.
That includes can-reach with an IP address, which we also tested. But as I said before, our scenario is complex:
- The same cluster may have different NIC names, eth0, ens33, bound0, or others
- In the same cluster, the egress IP addresses of nodes are on different network segments
- Multiple NICs exist on the same node
- Any node must account for the IP switching scenario
- The cluster may not be connected to the external network
- In the same cluster, a given /etc/hosts entry may not exist on every node
- ...
So we still hope that can-reach can support multiple values, which would solve a lot of our problems.
Not sure if you're still interested in this one @luanshuo - there are two more comments open.
Closing due to inactivity, but happy to reopen if you pick this up again.