Workaround for stupid systemd-networkd behaviour
I know issues connected to this have been discussed before, but could keepalived maybe better handle systemd-networkd deleting things on reload?
In particular
- reinstating VIPs that get deleted (it does notice, so why not reinstall them right away?)
- routes (not sure if it notices)
- routing rules (this one it doesn't notice and debugging it was not fun)
- whatever else it installs (at least in the network stack)
Unfortunately, systemd-networkd is not only the engine behind most other stuff (netplan, networkmanager), but also the most featureful network manager if one needs stuff like vlan aware bridges, routing rules, special network settings (and it is somewhat declarative in its behaviour which is nice).
It would be better for systemd-networkd to allow fixing this (it already somewhat does for VIPs but rules just disappear on me), but that's not feasible (I would file a bug in their GitHub but Lennart banned me for making a good argument years ago and it would get ignored anyway because they know better).
Feel free to include a derogatory log message aimed at systemd when keepalived fixes stuff in this instance :-)
Thanks.
I have run some tests, and the deletion of any ip addresses, routes or routeing rules is detected by keepalived. If such an event occurs (and it shouldn't, because the addresses, routes and rules are keepalived's and not anyone else's), then keepalived will revert to backup state, and almost certainly then become master again (the exception would be if there is a higher priority VRRP instance that was held back from becoming master due to nopreempt being configured). The code was written like this since it was much simpler to handle the reinstatement of the addresses/routes/rules by using the existing code for the backup to master transition rather than adding explicit code to handle each individual deletion and reinstatement.
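To illustrate why the node that lost an address would normally become master again, here is a rough Python sketch of the backup-side timers. It assumes the Master_Down_Interval and Skew_Time formulas from RFC 5798 and that every node, including the one that just dropped to backup, starts its timer at the same moment. The node names and priorities are made up, and this is not keepalived's actual code.

```python
def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds a backup waits without adverts before claiming mastership."""
    # Skew_Time shrinks as priority grows, so higher priorities time out first.
    skew_time = (256 - priority) / 256 * advert_int
    return 3 * advert_int + skew_time


def next_master(priorities: dict[str, int], advert_int: float = 1.0) -> str:
    """Name of the node whose timer expires first (ties are not modelled)."""
    return min(
        priorities,
        key=lambda name: master_down_interval(priorities[name], advert_int),
    )


# Hypothetical pair: fw1 was master, fw2 is the lower-priority peer.
print(next_master({"fw1": 120, "fw2": 100}))  # fw1 takes over again

# Hypothetical pair where fw2 has the higher priority but was held back
# by nopreempt while fw1 was still advertising.
print(next_master({"fw1": 120, "fw2": 150}))  # fw2 takes over
```

The second call is the nopreempt exception described above: once the old master stops advertising, the higher priority peer is no longer held back.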
There really is no excuse for any other process, whether it be systemd-networkd or not, to delete addresses, routes or rules that do not belong to it (in other words it did not create).
A while ago we requested, and were allocated, a routeing protocol identifier for keepalived (value 18), and all routes and rules installed by keepalived are specified with that protocol id (see the description of protocol in the ip-route(8) man page). Unfortunately there is no equivalent for ip addresses.
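As a quick way to see which routes carry that protocol id, the JSON output of `ip -j route show` can be filtered. The Python sketch below assumes the `protocol` field is reported either as the name `keepalived` or as the number 18, depending on the local rt_protos table. The sample data is invented for illustration.

```python
import json

# Protocol id 18 may be printed by name or by number (assumption).
KEEPALIVED_PROTOCOLS = {"keepalived", "18"}


def keepalived_entries(ip_json: str) -> list[dict]:
    """Return the entries whose protocol marks them as installed by keepalived."""
    entries = json.loads(ip_json)
    return [e for e in entries if str(e.get("protocol")) in KEEPALIVED_PROTOCOLS]


# Invented sample resembling `ip -j route show` output.
sample = json.dumps([
    {"dst": "10.3.0.0/16", "gateway": "10.255.255.14", "dev": "peering",
     "protocol": "keepalived", "metric": 1000},
    {"dst": "default", "gateway": "192.0.2.1", "dev": "eth0",
     "protocol": "static"},
])

for route in keepalived_entries(sample):
    print(route["dst"], "via", route.get("gateway"))
```

The same filter should also work on `ip -j rule show` output, provided the installed iproute2 prints the protocol for rules.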
@zviratko Can you please provide some specific examples of the problems you are experiencing? In other words, provide your keepalived configuration files, along with what actions are happening/commands being executed that cause the problem, what impact it has on keepalived, and ideally the keepalived log entries at the time.
I see (and understand). Some docs talk about "reinstating" addresses and routes, but I wasn't able to confirm whether it was really implemented that way.
Interesting note about nopreempt - I have it set, so that my firewalls don't flip/flop (it should stick to last healthy node). Not sure what the correct setup for that is then? Keepalived for sure either doesn't notice a rule missing or didn't transition to BACKUP due to my misconfiguration.
I'm not sure I can provide anything truly reproducible, except trying to delete something by hand (which I'm willing to do one day during maintenance, as this is in production). Sometimes when a VM goes up/down and its interface is deleted, or when I do "networkctl reload", or maybe on a full moon, systemd-networkd just decides to delete something. keepalived usually transitioned to BACKUP; this was probably the first time it didn't, yet a crucial ip rule was missing.
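For spotting a missing rule before it causes trouble, a small check against `ip -j rule show` might help. This Python sketch assumes the JSON fields `priority` and `table` as printed by iproute2. The sample output is invented, and the rule being checked (priority 100, lookup main) is the one from the config further down.

```python
import json
import subprocess


def rule_present(rules_json: str, priority: int, table: str) -> bool:
    """True if a rule with this priority and lookup table is listed."""
    rules = json.loads(rules_json)
    return any(
        r.get("priority") == priority and r.get("table") == table
        for r in rules
    )


def current_rules() -> str:
    """Fetch the live rule list (needs iproute2 with JSON support)."""
    result = subprocess.run(
        ["ip", "-j", "rule", "show"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Invented sample: only the three default rules are left.
sample = json.dumps([
    {"priority": 0, "src": "all", "table": "local"},
    {"priority": 32766, "src": "all", "table": "main"},
    {"priority": 32767, "src": "all", "table": "default"},
])

if not rule_present(sample, 100, "main"):
    print("rule 'priority 100 lookup main' is missing")
```

On a real host, `rule_present(current_rules(), 100, "main")` could be run from cron or a vrrp_script, though that wiring is left out here.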
Over time, I added KeepConfiguration=yes to all my .network files. This kept it from deleting VIPs from the interfaces.
Now I have also added ManageForeignRoutes=no and ManageForeignRoutingPolicyRules=no to the systemd-networkd config, which should prevent it from deleting routes and rules.
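To double-check that these settings are really in place on every node, something like the following Python sketch could be run against the networkd configuration. It assumes the option names documented in networkd.conf(5), ManageForeignRoutes= and ManageForeignRoutingPolicyRules=, and that they live in the [Network] section. The sample text is made up.

```python
import configparser

# Settings expected in the [Network] section of networkd.conf (assumption).
WANTED = {
    "ManageForeignRoutes": "no",
    "ManageForeignRoutingPolicyRules": "no",
}


def missing_settings(conf_text: str) -> list[str]:
    """Return the wanted options that are absent or set to another value."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    section = parser["Network"] if parser.has_section("Network") else {}
    return [key for key, value in WANTED.items() if section.get(key) != value]


# Made-up example with only one of the two options set.
sample = """
[Network]
ManageForeignRoutes=no
"""

print(missing_settings(sample))
```

The same approach could be pointed at each .network file to look for KeepConfiguration=yes by swapping the contents of `WANTED`.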
> Unfortunately there is no equivalent for ip addresses.
You could make your own IP scope :-) but systemd-networkd would delete it anyway.
The weird thing is, that sometimes it (networkd) just doesn't do that and everything works. Sometimes it goes crazy. But that's not really an issue for this repository (it is too civilized for this debate).
I know the "right" thing to do is to boycott systemd or at least not use the networkd component, but it's going to be hard (and there's nothing to switch to unless I want to run Gentoo with openrc/netrc).
Config file below (public IPs redacted). Thanks for any insight!
! Configuration File for keepalived
global_defs {
    notification_email {
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    vrrp_startup_delay 30
    vrrp_lower_prio_no_advert true
    vrrp_garp_master_delay 1
    vrrp_garp_interval 0.005
    vrrp_gna_interval 0.0005
    script_user root root
    enable_script_security
    max_auto_priority 99
    dynamic_interfaces allow_if_changes
}
interface_up_down_delays {
    peering 2
    prodint 2
    devint 2
    prodsvc 2
    devsvc 2
    devpublic 2
    devpxe 2
    prodpxe 2
    heartbeat 2
    vlan992 2
    vlan991 2
    drbd 2
    bridge 2
    lacp0 2
}
vrrp_script shgw {
    script "/usr/bin/fping -u 1.1.1.1"
    interval 5
    timeout 5
    rise 3
    fall 3
}
vrrp_instance PROD {
    state BACKUP
    nopreempt
    dont_track_primary
    promote_secondaries
    interface peering
    virtual_router_id 1
    priority "120"
    advert_int 1
    smtp_alert
    virtual_ipaddress {
        1.2.3.4/26 dev public96
        10.255.255.1/28 dev peering
        10.64.0.1/22 dev prodint
        10.64.4.1/22 dev prodsvc
        10.64.15.1/24 dev prodpxe
        192.168.100.72/24 dev vlan992
        10.1.0.2/24 dev vlan991
        10.64.20.1/22 dev devsvc
        10.64.24.1/22 dev devpublic
        10.64.31.1/24 dev devpxe
        10.64.16.1/22 dev devint
    }
    virtual_routes {
        10.3.0.0/16 via 10.255.255.14 dev peering src 10.64.0.6 metric 1000
        10.9.0.0/16 via 10.255.255.14 dev peering src 10.64.0.6 metric 1000
        10.32.32.0/20 via 10.255.255.14 dev peering src 10.64.0.6 metric 1000
        10.64.4.0/22 dev prodsvc src 10.64.0.6
        10.64.20.0/22 dev devsvc src 10.64.0.6
    }
    virtual_rules {
        to 0.0.0.0/0 priority 100 lookup main
    }
    track_script {
        shgw
    }
    notify_master "/etc/keepalived/master.sh"
    notify_backup "/etc/keepalived/backup.sh"
    notify_fault "/etc/keepalived/fault.sh"
    notify_stop "/etc/keepalived/backup.sh"
}
Just a couple of comments on your configuration.
- Using dont_track_primary is unusual. Do you really want that?
- You have configured interface delays on interfaces you are not using, e.g. lacp0.
I said in my previous post that it was not possible to set a "protocol" for ip addresses in the way that can be done for routes and rules. I have since discovered that kernel commit 47f0bd503210 added exactly that feature, which first appeared in Linux v5.18 and was first supported by v6.4.0 of the iproute utility. I will add support for this in keepalived.
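Whether a given host can use this depends on the two version thresholds just mentioned. The Python sketch below compares a kernel release string and the output of `ip -V` against Linux 5.18 and iproute2 6.4.0. The parsing is an assumption about typical version strings (for example "6.5.0-14-generic" and "ip utility, iproute2-6.4.0") and may need adjusting for distributions that format them differently.

```python
import re

# Matches the first dotted version such as 5.18, 6.4.0 or 6.5.0-14-generic.
_VERSION = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?")


def parse_version(text: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from a version string, patch defaults to 0."""
    match = _VERSION.search(text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    major, minor, patch = (int(group or 0) for group in match.groups())
    return (major, minor, patch)


def supports_address_protocol(kernel_release: str, ip_version_output: str) -> bool:
    """True if both kernel and iproute2 meet the thresholds from this thread."""
    kernel_ok = parse_version(kernel_release) >= (5, 18, 0)
    iproute_ok = parse_version(ip_version_output) >= (6, 4, 0)
    return kernel_ok and iproute_ok


# Example strings, as might come from `uname -r` and `ip -V`.
print(supports_address_protocol("5.15.0-91-generic",
                                "ip utility, iproute2-6.1.0, libbpf 1.1.0"))
print(supports_address_protocol("6.5.0-14-generic",
                                "ip utility, iproute2-6.4.0"))
```

Distribution kernels sometimes backport features, so a version comparison like this is only a first approximation.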
Thank you for taking a look
> - Using dont_track_primary is unusual. Do you really want that?
> - You have configured interface delays on interfaces you are not using, e.g. lacp0.
I did both of these in an attempt to make keepalived as "lenient" as possible. I am surprised this is the only obvious extra stuff that's in there :-) networkctl reload/systemd-networkd restart sometimes cycle the interfaces (generally unpredictable behaviour). In the log I saw that keepalived noticed this just before a failover, so in the end I added everything in case it was ever needed (or in case keepalived cared for some reason). It's also easier to just put all the interfaces in there with ansible...
Btw, with this config keepalived sometimes just doesn't execute the backup script on failover. Sadly it was not reproducible, and it occurred only after a firewall had been running in MASTER state for some time (like a week). I didn't open an issue because I know the right thing to do is to use the FIFO, but in case you see anything in there that might be causing that, let me know. It could also be useful to at least log that keepalived is trying to execute it (or isn't, for some reason), as all I can say is that it never reached the first line in the script.
> I said in my previous post that it was not possible to set a "protocol" for ip addresses in the way that can be done for routes and rules. I have since discovered that kernel commit 47f0bd503210 added exactly that feature, which first appeared in Linux v5.18 and was first supported by v6.4.0 of the iproute utility. I will add support for this in keepalived.
Cool, but systemd-networkd still requires configuration for that (and there's no filtering for "ignore proto keepalived"). Maybe it would be better to use "proto kernel" by default when running under systemd so it gets ignored? I can't imagine anyone not wanting everything to survive systemd-networkd interference. It took me a good while to realize what is happening...