ZeroTierOne

Withdraw managed route when member down

Open arhue opened this issue 3 years ago • 6 comments

I have ZeroTier deployed on two routers, with the same managed route pointing to both, since I want one to take over when the other fails. But when I take down one of the routers, the managed route is not withdrawn.

Screenshot 1: Member down.

Screenshot 2: Route on ZeroTier UI still active.

Screenshot 3: Route on Windows still pointing to the member which is down, even after restart.

  • What you expect to be happening.

Route to be automatically removed when member goes offline.

  • What is actually happening?

Route is not getting removed when member goes offline.

  • Any steps to reproduce the error.

Set up ZeroTier on two member nodes with the same managed route pointing to them, then turn one of them off.

  • Any relevant console output or screenshots.

See above.

  • What operating system and ZeroTier version. Please try the latest ZeroTier release.

ZeroTier version 1.6.6 on MikroTik routers. ZeroTier 1.10.1 on Windows host.

arhue avatar Aug 12 '22 16:08 arhue

I am working around the issue with VRRP and it seems to be working well, but having failover at L3 in ZeroTier itself should be much better.

arhue avatar Aug 12 '22 18:08 arhue

This seems like something that would also be valuable with a priority, so you could say 192.168.192.80 is the primary (say priority 10), and 192.168.192.217 is the failover (priority 0). ZeroTier would then only ever keep the highest-priority route among the online nodes.
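To make the priority idea concrete, here is a minimal sketch of the selection rule only. It is not ZeroTier code: the `priority` field, the `online` flag, and the names `RouteCandidate` and `select_route` are all hypothetical, and ZeroTier managed routes have no such priority setting as far as this thread indicates.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RouteCandidate:
    """One possible gateway for a managed route (hypothetical structure)."""
    target: str    # destination network, e.g. "10.0.0.0/24" (made-up value)
    via: str       # label or address of the gateway member
    priority: int  # hypothetical field: higher value wins
    online: bool   # liveness as judged by whoever evaluates the route


def select_route(candidates: List[RouteCandidate]) -> Optional[RouteCandidate]:
    """Return the highest-priority candidate that is online, or None."""
    online = [c for c in candidates if c.online]
    if not online:
        return None
    return max(online, key=lambda c: c.priority)


primary = RouteCandidate("10.0.0.0/24", "primary-gateway", 10, online=False)
failover = RouteCandidate("10.0.0.0/24", "failover-gateway", 0, online=True)
chosen = select_route([primary, failover])
```

With the primary marked offline, `chosen` is the failover candidate; if both were online, the primary would win. The hard part, deciding what sets `online`, is left open here.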

DarkArc avatar Aug 24 '23 02:08 DarkArc

One challenge is that ZeroTier itself doesn't know whether a node is up or down. It's a peer-to-peer system. Down according to whom? I think VRRP, OSPF, etc. are a good workaround.

laduke avatar Aug 24 '23 17:08 laduke

Hm... Last night I was thinking this might be useful even if the failover was in a peer-to-peer scope (e.g., peer A sees that its route to X via peer B is unavailable because there's not a working connection to peer B, so peer A switches over to peer C).
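A rough sketch of that peer-scoped idea, in which each peer decides for itself using only its own view of the other peers. Everything here is an assumption for illustration: the `pick_gateway` function, the `last_heard` map, and the 30-second timeout do not correspond to any ZeroTier API or setting.

```python
import time
from typing import Dict, List, Optional


def pick_gateway(
    gateways: List[str],
    last_heard: Dict[str, float],
    now: Optional[float] = None,
    timeout_s: float = 30.0,  # arbitrary illustrative value
) -> Optional[str]:
    """Return the first gateway, in preference order, that this peer has
    heard from within timeout_s seconds; None if no gateway qualifies."""
    if now is None:
        now = time.monotonic()
    for gw in gateways:
        seen = last_heard.get(gw)
        if seen is not None and now - seen <= timeout_s:
            return gw
    return None


# Peer A prefers B, then C. It last heard from B 100 s ago and from C 5 s ago.
choice = pick_gateway(["B", "C"], {"B": 0.0, "C": 95.0}, now=100.0)
```

Here peer A falls back to `"C"` because B has been silent longer than the timeout. Two peers with different views could pick different gateways, which is the "down according to who?" concern in practice.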

In the morning light, VRRP and OSPF might lead to a better result than what ZeroTier can directly provide anyway (because the routing of the remote LAN into ZeroTier, assuming this isn't just a masquerade, would need to know which node is up). That said, I've never used those tools, and it's not super clear how to make such a solution work. Do you run two ZeroTier nodes that impersonate each other, with only one ever on? How do you make that happen with VRRP? etc.


This might be something worth writing a guide for. As a general point of feedback, I think ZeroTier, while often superior in function, is lacking guides. e.g., https://tailscale.com/kb/1115/subnet-failover/ describes how this can be accomplished with Tailscale in a masquerading setup, which, granted, is not what I want. This is closer to what I want (and have running on ZeroTier): https://tailscale.com/kb/1214/site-to-site/ ... however I'd like failover (because of unforeseen circumstances like https://github.com/zerotier/ZeroTierOne/issues/2105).

i.e., it seems one quality option would be to have a guide introducing a site-to-site setup and a guide on using site-to-site setup with a failover (and any ZeroTier specific integrations/pieces).

Another might be having this functionality built in on the ZeroTier side (as it seems to be in Tailscale) and requiring (for site-to-site setups) that whatever router is there be responsible for managing its own routing table.

DarkArc avatar Aug 24 '23 17:08 DarkArc

iBGP with a route reflector or two (or a route server if using eBGP) would be one way of doing it, but a node would have to be designated as the reflector, which makes it a SPOF, so you would ideally need two of them. There is also the hassle of configuration, although using things like peer groups and a BGP listen range simplifies the config somewhat.

bodleytunes avatar Feb 15 '24 22:02 bodleytunes

see also #2223

laduke avatar Feb 16 '24 17:02 laduke