vyos-1x
routing: T1237: Add new feature failover route
Change Summary
The failover route feature installs a static route into the kernel routing
table only while the required target or gateway is alive.
When the target or gateway stops responding to ICMP/ARP checks, the route
is deleted from the routing table.
Routes are marked with the protocol failover (defined in rt_protos):
cat /etc/iproute2/rt_protos.d/failover.conf
111 failover
ip route add 203.0.113.1 metric 2 via 192.0.2.1 dev eth0 proto failover
$ sudo ip route show proto failover
203.0.113.1 via 192.0.2.1 dev eth0 metric 2
Because these routes carry their own protocol tag, we can safely flush them without touching any other routes.
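The add/delete/flush behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function names `route_command`, `reconcile` and `flush_command` are hypothetical, and the sketch only builds the `ip route` command strings rather than running them.

```python
FAILOVER_PROTO = "failover"  # name registered in /etc/iproute2/rt_protos.d/failover.conf


def route_command(action, prefix, next_hop, interface, metric):
    """Build an `ip route add|del` command tagged with proto failover."""
    if action not in ("add", "del"):
        raise ValueError(f"unsupported action: {action}")
    return (
        f"ip route {action} {prefix} via {next_hop} dev {interface} "
        f"metric {metric} proto {FAILOVER_PROTO}"
    )


def reconcile(target_alive, route_installed, **route):
    """Return the command that brings the kernel table in line with the
    check result, or None when nothing has to change."""
    if target_alive and not route_installed:
        return route_command("add", **route)
    if not target_alive and route_installed:
        return route_command("del", **route)
    return None


def flush_command():
    """Remove every route carrying the failover protocol tag in one go."""
    return f"ip route flush proto {FAILOVER_PROTO}"
```

For example, `reconcile(False, True, prefix="203.0.113.1", next_hop="192.0.2.1", interface="eth0", metric=2)` yields the matching `ip route del` command, while `flush_command()` only ever touches routes tagged with proto failover.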
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Code style update (formatting, renaming)
- [ ] Refactoring (no functional changes)
- [ ] Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
- [ ] Other (please describe):
Related Task(s)
- https://phabricator.vyos.net/T1237
Component(s) name
failover, route
Proposed changes
How to test
VyOS configuration:
set protocols failover route 203.0.113.1/32 next-hop 192.168.100.1 check target '192.168.100.1'
set protocols failover route 203.0.113.1/32 next-hop 192.168.100.1 check timeout '10'
set protocols failover route 203.0.113.1/32 next-hop 192.168.100.1 check type 'icmp'
set protocols failover route 203.0.113.1/32 next-hop 192.168.100.1 interface 'eth1'
set protocols failover route 203.0.113.1/32 next-hop 192.168.100.1 metric '2'
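As an illustration of what an `icmp` check for the configuration above might look like, here is a minimal sketch. It is not taken from `vyos-failover.py`: the helper names are hypothetical, and it assumes that `check timeout` maps to the per-probe wait (`-W`) of iputils `ping`, which the PR does not state.

```python
import subprocess


def icmp_check_command(target, timeout):
    """Build a single-probe ping command.

    -c (probe count) and -W (seconds to wait for a reply) are iputils ping
    options; mapping `check timeout` to -W is an assumption of this sketch.
    """
    return ["ping", "-c", "1", "-W", str(timeout), target]


def is_alive(target, timeout, run=subprocess.run):
    """Return True when the probe exits with status 0 (a reply was received).

    `run` is injectable so the logic can be exercised without sending packets.
    """
    result = run(
        icmp_check_command(target, timeout),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

With the values configured above, the check would be `is_alive("192.168.100.1", 10)`.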
Check service
vyos@r14# sudo systemctl status vyos-failover
● vyos-failover.service - Failover route service
Loaded: loaded (/etc/systemd/system/vyos-failover.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2022-06-13 19:53:23 EEST; 4s ago
Main PID: 6515 (python3)
Tasks: 1 (limit: 4695)
Memory: 6.1M
CPU: 35ms
CGroup: /system.slice/vyos-failover.service
└─6515 /usr/bin/python3 /usr/libexec/vyos/vyos-failover.py --config /run/vyos-failover.conf
Check routing table
vyos@r14:~$ show ip route | grep 203
K>* 203.0.113.1/32 [0/2] via 192.168.100.1, eth1, 00:01:00
vyos@r14:~$
vyos@r14:~$ sudo ip route show proto failover
203.0.113.1 via 192.168.100.1 dev eth1 metric 2
vyos@r14:~$
Deleting protocols failover must remove all routes marked with proto failover:
vyos@r14# delete protocols failover
[edit]
vyos@r14# commit
[edit]
vyos@r14# sudo ip route show proto failover
[edit]
vyos@r14#
Checklist:
- [x] I have read the CONTRIBUTING document
- [x] I have linked this PR to one or more Phabricator Task(s)
- [ ] I have run the components SMOKETESTS if applicable
- [x] My commit headlines contain a valid Task id
- [x] My change requires a change to the documentation
- [ ] I have updated the documentation accordingly
I wonder if we can merge this PR with the WAN load-balance functionality for VyOS 1.4. Other vendors refer to such a feature as IP SLA. Personally, I find the CLI notation a bit clumsy, but I have no better idea yet, sorry.
@sever-sever very cool, this is already a much better solution than the current WLB implementation. Two things relating to timing, though. First, the timeout parameter might be better named "interval" to reflect the way it works behind the scenes. Second, this parameter does not seem to be used for TCP checks: the timeout there is fixed at 2 seconds, so if I were to configure a 1 second timeout, none of the other checks would happen for a full 3 seconds (if the target is down). The same applies when there are many routes with long timeouts. A quick(ish) workaround would be to use instance parameters on a templated systemd job, so there is one service instance per route. Cleanup is straightforward, since it is "kill everything I don't know about", then "(re)start what I do know about".
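To make the timing point concrete, here is a small sketch of a TCP check that takes its timeout from the caller instead of a fixed 2 seconds, plus the worst-case arithmetic for checks that run one after another. Both function names are hypothetical and nothing here is copied from the PR.

```python
import socket


def tcp_check(host, port, timeout):
    """Return True if a TCP connection to host:port succeeds within
    `timeout` seconds; refused, unreachable and timed-out attempts are False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def worst_case_cycle(timeouts):
    """Upper bound, in seconds, on one pass over all checks when they run
    sequentially and every target is down (each check waits out its timeout)."""
    return sum(timeouts)
```

With a 1 second check followed by a 2 second TCP check, `worst_case_cycle([1, 2])` gives the 3 seconds mentioned above. Running one service instance per route, as suggested, would avoid summing the timeouts because the checks would no longer wait on each other.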
I would also consider table and vrf parameters. table is particularly useful for PBR: in table 10, consumer bulk traffic uplinks could carry general internet traffic but fail over to a priority link if they are down, while table 20 would hold the reverse for the actual priority traffic.
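A rough sketch of how an optional table parameter could feed into the generated route command is shown below. The function name, the next-hop addresses and the interface names are hypothetical, and vrf is left out of the sketch.

```python
def failover_route_command(prefix, next_hop, interface, metric, table=None):
    """Build an `ip route add` command tagged proto failover,
    optionally placing the route in a specific routing table."""
    parts = [
        "ip", "route", "add", prefix,
        "via", next_hop,
        "dev", interface,
        "metric", str(metric),
    ]
    if table is not None:
        parts += ["table", str(table)]
    parts += ["proto", "failover"]
    return " ".join(parts)


# Table 10: the bulk uplink is preferred (lower metric), the priority link is the fallback.
bulk_primary = failover_route_command("default", "192.0.2.1", "eth0", 1, table=10)
bulk_fallback = failover_route_command("default", "198.51.100.1", "eth1", 2, table=10)

# Table 20: the reverse ordering for the actual priority traffic.
prio_primary = failover_route_command("default", "198.51.100.1", "eth1", 1, table=20)
prio_fallback = failover_route_command("default", "192.0.2.1", "eth0", 2, table=20)
```

If the health check removes the lower-metric route from a table, traffic steered into that table by PBR falls through to the remaining higher-metric route.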
As mentioned in the other issue, the dpinger tool could be added alongside this, which would add even more functionality. I ran a basic test to see whether it compiles on VyOS; it does, and it functions as expected.
@c-po I think we should call it SLA if the above can be implemented, as from my understanding IP SLA implies measurement of link quality, not just the possibility of routing. Being able to run show ip sla statistics on VyOS would be next level 👀
Hello. Do you plan to use this mechanism for WAN failover?