avahi Often / permanent Leaving and Joining multicast group...

Hi, I'm using avahi-daemon 0.7 on a Raspberry Pi 4 with Buster (v10). In the syslog I often get a repeating message about leaving and joining the multicast group with IPv6 addresses on eth0. In my FritzBox I have set the option to always assign the same IP address to the Raspi. Could anybody tell me what this message is about and how to fix the possibly corresponding problem? It seems to limit the performance of the homebridge service: sometimes events triggered e.g. by switches are not perceived by homebridge, so I guess these syslog messages are associated with this problem.

--- syslog excerpt:

May 26 22:25:32 homebridge avahi-daemon[364]: Leaving mDNS multicast group on interface eth0.IPv6 with address 2003:d2:1f16:ab00:3a43:c8d1:2b7b:703a.
May 26 22:25:32 homebridge avahi-daemon[364]: Joining mDNS multicast group on interface eth0.IPv6 with address fe80::722:5ee8:f9d1:d083.
May 26 22:25:32 homebridge avahi-daemon[364]: Registering new address record for fe80::722:5ee8:f9d1:d083 on eth0.*.
May 26 22:25:32 homebridge avahi-daemon[364]: Leaving mDNS multicast group on interface eth0.IPv6 with address fe80::722:5ee8:f9d1:d083.
May 26 22:25:32 homebridge avahi-daemon[364]: Joining mDNS multicast group on interface eth0.IPv6 with address 2003:d2:1f16:ab00:3a43:c8d1:2b7b:703a.
May 26 22:25:32 homebridge avahi-daemon[364]: Registering new address record for 2003:d2:1f16:ab00:3a43:c8d1:2b7b:703a on eth0.*.
May 26 22:25:32 homebridge avahi-daemon[364]: Withdrawing address record for fe80::722:5ee8:f9d1:d083 on eth0.
May 26 22:26:46 homebridge avahi-daemon[364]: Withdrawing address record for 2003:d2:1f16:ab00:3a43:c8d1:2b7b:703a on eth0.
May 26 22:26:46 homebridge avahi-daemon[364]: Leaving mDNS multicast group on interface eth0.IPv6 with address 2003:d2:1f16:ab00:3a43:c8d1:2b7b:703a.
May 26 22:26:46 homebridge avahi-daemon[364]: Joining mDNS multicast group on interface eth0.IPv6 with address fd2e:d326:15e:1:f94:5479:14e6:a284.
May 26 22:26:46 homebridge avahi-daemon[364]: Registering new address record for fd2e:d326:15e:1:f94:5479:14e6:a284 on eth0.*.
May 26 22:26:46 homebridge avahi-daemon[364]: Registering new address record for fe80::722:5ee8:f9d1:d083 on eth0.*.
May 26 22:26:46 homebridge avahi-daemon[364]: Withdrawing address record for fd2e:d326:15e:1:f94:5479:14e6:a284 on eth0.
May 26 22:26:46 homebridge avahi-daemon[364]: Leaving mDNS multicast group on interface eth0.IPv6 with address fd2e:d326:15e:1:f94:5479:14e6:a284.
May 26 22:26:46 homebridge avahi-daemon[364]: Joining mDNS multicast group on interface eth0.IPv6 with address fe80::722:5ee8:f9d1:d083.
May 26 22:26:46 homebridge avahi-daemon[364]: Leaving mDNS multicast group on interface eth0.IPv6 with address fe80::722:5ee8:f9d1:d083.
May 26 22:26:46 homebridge avahi-daemon[364]: Joining mDNS multicast group on interface eth0.IPv6 with address fd2e:d326:15e:1:f94:5479:14e6:a284.
May 26 22:26:46 homebridge avahi-daemon[364]: Registering new address record for fd2e:d326:15e:1:f94:5479:14e6:a284 on eth0.*.
May 26 22:26:46 homebridge avahi-daemon[364]: Registering new address record for 2003:d2:1f16:ab00:3a43:c8d1:2b7b:703a on eth0.*.
May 26 22:26:46 homebridge avahi-daemon[364]: Withdrawing address record for fd2e:d326:15e:1:f94:5479:14e6:a284 on eth0.
May 26 22:26:46 homebridge avahi-daemon[364]: Withdrawing address record for fe80::722:5ee8:f9d1:d083 on eth0.
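To see how often each address flaps, the Leaving/Joining messages in an excerpt like this can be tallied per address. A minimal Python sketch, assuming the log format matches the lines quoted here; the regular expression and the name `count_group_events` are illustrative assumptions, not part of Avahi:

```python
import re
from collections import Counter

# Assumption: messages look like the avahi-daemon 0.7 lines quoted in this thread.
# The address is followed by a full stop and then whitespace or end of text.
EVENT_RE = re.compile(
    r"avahi-daemon\[\d+\]: (Leaving|Joining) mDNS multicast group "
    r"on interface (\S+) with address (\S+?)\.(?:\s|$)"
)


def count_group_events(log_text):
    """Return a Counter keyed by (action, interface, address)."""
    return Counter(m.groups() for m in EVENT_RE.finditer(log_text))


if __name__ == "__main__":
    # Hypothetical usage: the syslog path is an assumption for a Debian system.
    with open("/var/log/syslog", errors="replace") as f:
        for key, n in count_group_events(f.read()).most_common():
            print(n, *key)
```

An address that shows roughly equal Leaving and Joining counts, many times per hour, is being removed and re-added rather than changing permanently.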

Update: meanwhile I've deactivated IPv6 in the avahi config. After restarting avahi-daemon, these messages appear:

May 26 23:13:25 homebridge avahi-daemon[31234]: Successfully called chroot().
May 26 23:13:25 homebridge avahi-daemon[31234]: Successfully dropped remaining capabilities.
May 26 23:13:25 homebridge avahi-daemon[31234]: No service file found in /etc/avahi/services.
May 26 23:13:25 homebridge avahi-daemon[31234]: ***** WARNING: Detected another IPv4 mDNS stack running on this host. This makes mDNS unreliable and is thus not recommende**
May 26 23:13:25 homebridge avahi-daemon[31234]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.178.113.
May 26 23:13:25 homebridge avahi-daemon[31234]: New relevant interface eth0.IPv4 for mDNS.
May 26 23:13:25 homebridge avahi-daemon[31234]: Network interface enumeration completed.
May 26 23:13:25 homebridge avahi-daemon[31234]: Registering new address record for 2003:d2:1f16:ab00:3a43:c8d1:2b7b:703a on eth0.*.
May 26 23:13:25 homebridge avahi-daemon[31234]: Registering new address record for 192.168.178.113 on eth0.IPv4.
May 26 23:13:25 homebridge avahi-daemon[31234]: Server startup complete. Host name is homebridge.local. Local service cookie is 1150634006.

So.... how do I have another mdns stack? I've also installed pihole in a quite convenient way. Could this lead to this problem? How should I analyze this?
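One way to look for a second mDNS stack is to list the UDP sockets bound to port 5353 (the mDNS port). A minimal Python sketch that parses the text of /proc/net/udp (or /proc/net/udp6, which uses the same column layout); the function name `mdns_sockets` is an assumption for illustration:

```python
MDNS_PORT = 5353  # 0x14E9 in the hexadecimal notation used by /proc/net/udp


def mdns_sockets(proc_net_udp_text):
    """Return (local_address_hex, uid, inode) for each UDP socket bound to port 5353."""
    found = []
    for line in proc_net_udp_text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 10:
            continue
        addr_hex, port_hex = fields[1].rsplit(":", 1)
        if int(port_hex, 16) == MDNS_PORT:
            # Column 7 is the owning uid, column 9 the socket inode.
            found.append((addr_hex, int(fields[7]), int(fields[9])))
    return found


if __name__ == "__main__":
    for name in ("/proc/net/udp", "/proc/net/udp6"):
        with open(name) as f:
            for entry in mdns_sockets(f.read()):
                print(name, *entry)
```

avahi-daemon's own socket will appear in the list. Additional entries owned by a different uid point to another process listening for mDNS; mapping an inode to a process means matching it against the socket links under /proc/<pid>/fd, which typically needs root.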

Thanks for any help and hints.

Best regards, Matthias

May 26 '21 21:05 gintonic4all

This is a bug I've seen a couple of times but not tracked down. An avahi-daemon restart will stop it. I have a suspicion it may be due to time going backwards but haven't really proven that.

May 27 '21 01:05 lathiat

Yeah, I thought so, too. But after restarting, the bug occurs again after running properly for several minutes (about 10 - 20 min). :-/ And I still couldn't figure out what this warning is about... any idea?

And... why do these IPv6 messages occur even though IPv6 is deactivated in the avahi config?

Here's the config:

[server]
#host-name=foo
#domain-name=local
#browse-domains=0pointer.de, zeroconf.org
use-ipv4=yes
use-ipv6=no
#allow-interfaces=eth0
#deny-interfaces=eth1
#check-response-ttl=no
#use-iff-running=no
#enable-dbus=yes
#disallow-other-stacks=no
#allow-point-to-point=no
#cache-entries-max=4096
#clients-max=4096
#objects-per-client-max=1024
#entries-per-entry-group-max=32
ratelimit-interval-usec=1000000
ratelimit-burst=1000

[wide-area]
enable-wide-area=yes

[publish]
#disable-publishing=no
#disable-user-service-publishing=no
#add-service-cookie=no
#publish-addresses=yes
publish-hinfo=no
publish-workstation=no
#publish-domain=yes
#publish-dns-servers=192.168.50.1, 192.168.50.2
#publish-resolv-conf-dns-servers=yes
#publish-aaaa-on-ipv4=yes
#publish-a-on-ipv6=no

[reflector]
#enable-reflector=no
#reflect-ipv=no

[rlimits]
#rlimit-as=
#rlimit-core=0
#rlimit-data=8388608
#rlimit-fsize=0
#rlimit-nofile=768
#rlimit-stack=8388608
#rlimit-nproc=3

May 27 '21 18:05 gintonic4all

Try setting publish-aaaa-on-ipv4=no to prevent Avahi from publishing IPv6 addresses over IPv4 (which requires tracking IPv6 interfaces even when IPv6 is otherwise disabled).

Dec 01 '21 14:12 mvduin

Or, if it's indeed related to system time changes, then #96 should properly fix the problem (as opposed to disabling IPv6 support entirely, which is of course an ugly workaround).

Dec 01 '21 14:12 mvduin

@gintonic4all / @lathiat - I was experiencing a similar thing (also on Debian), and I started by applying all potentially relevant PRs and patches to my Avahi (including #96) with no effect.

I did some further digging and it turns out that, in my case, I am getting RTM_DELADDR messages to the netlink callback, immediately (within a few ms) followed by RTM_NEWADDR messages, for all the IPv6 addresses on my interface, once every few minutes. I am not sure why this happens - it is presumably a Linux bug in the Ethernet driver.
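For anyone who wants to observe the same RTM_DELADDR / RTM_NEWADDR pairs without patching Avahi, a small netlink listener is enough. A minimal Python sketch, assuming Linux; the numeric constants are taken from <linux/rtnetlink.h>, and the function names are hypothetical:

```python
import socket
import struct
import time

RTM_NEWADDR = 20            # values from <linux/rtnetlink.h>
RTM_DELADDR = 21
RTMGRP_IPV6_IFADDR = 0x100  # multicast group for IPv6 address changes
NLMSG_HDR = struct.Struct("=IHHII")  # struct nlmsghdr: len, type, flags, seq, pid


def address_events(buf):
    """Yield 'NEWADDR' or 'DELADDR' for each address message in one netlink datagram."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(buf):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(buf, offset)
        if length < NLMSG_HDR.size:
            break  # malformed header, stop parsing
        if msg_type == RTM_NEWADDR:
            yield "NEWADDR"
        elif msg_type == RTM_DELADDR:
            yield "DELADDR"
        offset += (length + 3) & ~3  # netlink messages are 4-byte aligned


def watch():
    """Print timestamped IPv6 address add/remove events until interrupted (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_IPV6_IFADDR))
    while True:
        for event in address_events(sock.recv(65535)):
            print(f"{time.time():.3f} {event}")
```

Calling `watch()` and then sending a router solicitation should show whether a DELADDR is followed by a NEWADDR within a few milliseconds. The sketch only reads the message type; it does not decode which address is affected.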

The problem's impact on Avahi is not just cosmetic - it makes Avahi send out unsolicited announcements with TTL 0 (to cancel the address records) and then TTL 120 (to renew), which for short time periods would make advertised services disappear.

I implemented a work-around in avahi-core/iface-linux.c which basically defers actioning of RTM_DELADDR messages by 250ms. If an RTM_NEWADDR is received while there is a deferred RTM_DELADDR in the queue, then both are cancelled. I don't think the work-around is 'production ready' but it is more to validate the approach.

I need to do some more testing but this seems to eliminate the issue.
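The deferral logic described above can be sketched independently of Avahi. This is not the actual patch to avahi-core/iface-linux.c (which is C); it is a hedged Python illustration of the idea, and the class and method names are hypothetical:

```python
class DeferredWithdrawals:
    """Hold back address withdrawals for a short delay so that an immediate
    re-add of the same address cancels the withdrawal instead of flapping."""

    def __init__(self, delay=0.250):
        self.delay = delay   # seconds; 250 ms as in the description above
        self.pending = {}    # address -> time at which the withdrawal becomes due

    def on_deladdr(self, address, now):
        """Record a removal, but do not act on it yet."""
        self.pending[address] = now + self.delay

    def on_newaddr(self, address, now):
        """Return False if this add cancels a pending removal, True if the
        address is genuinely new and should be registered."""
        deadline = self.pending.get(address)
        if deadline is not None and now < deadline:
            del self.pending[address]
            return False
        return True

    def due(self, now):
        """Pop and return the addresses whose deferral has expired; the caller
        withdraws these for real."""
        expired = [a for a, t in self.pending.items() if t <= now]
        for a in expired:
            del self.pending[a]
        return expired
```

The caller is assumed to invoke `due()` from a timer before feeding in further events. The trade-off is that every genuine address removal is announced up to 250 ms late.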

Jan 18 '22 17:01 adriancable

@adriancable If avahi is receiving RTM_DELADDR followed by RTM_NEWADDR then your machine is genuinely losing and regaining its IPv6 address, and in particular this isn't a bug in avahi, nor does it sound likely to be a bug in the Ethernet driver. Try sniffing your network for IPv6 router solicitation messages to see what's going on.

The workaround you're suggesting does not "eliminate" the issue, it conceals the issue.

Jan 19 '22 00:01 mvduin

@mvduin - absolutely clear that this isn't a bug in avahi per se, assuming what I am seeing is the same issue as reported by the OP (and it does look so).

It isn't clear to me that "concealing" the impact of this issue on Avahi is necessarily the wrong thing to do, if the underlying issue is widespread and it turns out there isn't a ready workaround for that.

I will do some sniffing to see if I can shed more light on what is actually going on.

Jan 19 '22 00:01 adriancable

@mvduin - one other note. I can trigger this by running rdisc6 eth0, so it is definitely related to router solicitations.

I don't have a lot of experience with different IPv6-capable routers. Presumably it is not normal for a router to drop and then re-add IPv6 addresses when it receives a router solicitation message.

I am using an Eero 6 Pro - I wonder what router the OP is using, and if it's a router bug. @gintonic4all ?

Jan 19 '22 00:01 adriancable

@adriancable Well apparently there also are or have been avahi bugs that caused messages like these, but in your case (and that of various other people) it looks more like a router on your network is causing mayhem and Avahi is merely making this issue visible, and even if you stop Avahi from making noise about it, there will still be issues. See also for example here where people reported it also resulted in Google Chrome occasionally interrupting loading the page because "the network has changed".

Jan 19 '22 00:01 mvduin

@mvduin - hmm, I'm not sure if what I am seeing is exactly the same as your example, although clearly it is all related to something being wrong with RAs.

In that case, I think your conclusion was that the OP's issue had been caused by preferred_lft falling to 0, with no RA being sent before that happens. That definitely isn't the case for me - my shortest preferred_lft in ip -6 address is 30 minutes and I am seeing this happen at much more frequent intervals, i.e. before any preferred_lft hits zero. Then there was another poster where the issue seems to have been caused by a faulty dnsmasq configuration. (I'm not using dnsmasq or anything similar.)
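To check the same thing on another machine, the preferred lifetimes can be pulled out of the `ip -6 address` output. A minimal Python sketch, assuming the usual iproute2 text layout (an `inet6` line followed by a `valid_lft ... preferred_lft ...` line); the function name is hypothetical:

```python
import re
import subprocess

ADDR_RE = re.compile(r"^\s*inet6 (\S+)")
LFT_RE = re.compile(r"preferred_lft (forever|\d+)")


def preferred_lifetimes(ip_output):
    """Map each IPv6 address (with prefix length) to its preferred lifetime
    in seconds, or None when the lifetime is 'forever'."""
    result, current = {}, None
    for line in ip_output.splitlines():
        m = ADDR_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = LFT_RE.search(line)
        if m and current is not None:
            value = m.group(1)
            result[current] = None if value == "forever" else int(value)
            current = None
    return result


if __name__ == "__main__":
    out = subprocess.run(["ip", "-6", "address"], capture_output=True, text=True).stdout
    for addr, lft in preferred_lifetimes(out).items():
        print(addr, "forever" if lft is None else f"{lft}s")
```

If addresses flap while every preferred lifetime is still well above zero, expiry of the preferred lifetime is unlikely to be the trigger.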

Jan 19 '22 00:01 adriancable

@adriancable No it's not the exact same issue, I also never said it was nor meant to imply so. It was merely another example of problems with IPv6 connectivity being misdiagnosed as an Avahi problem because it was logging about it, and it was showing that the temporary loss of connectivity was also causing other problems (causing Chrome to interrupt page loads).

Jan 19 '22 01:01 mvduin

Hi, I observed the same problem. In my case, the log entries about withdrawing and registering address records seemed to coincide with router advertisement messages. What helped for me was changing from stateless address configuration to DHCPv6. Since I activated DHCPv6, receiving router advertisement messages no longer leads to withdrawing and registering new address records.

Mar 30 '22 21:03 california444

@california444 - yes, that makes sense.

What I did in my case is to modify Avahi to simply defer all address withdrawals by 250ms. If a registration of a new record comes in for an address that's in the defer list, we simply cancel the withdrawal. Otherwise, we withdraw the record after the 250ms timer has expired.

Since the gap between withdrawal and registration after an RA message was around 10ms for me, the 250ms I allow is quite a reasonable margin and I don't think that a general 250ms delay in address record withdrawal will cause any practical issues.

Looking at logs for the last couple of months, this works around the issue 100% without introducing any side effects.

Yes, purists will say this isn't an Avahi issue blah blah blah but actually, we live in the real world which is full of product X working around issues in product Y. We can fix Avahi more quickly than we can beg router manufacturers to fix their own firmware.

Mar 30 '22 21:03 adriancable

@adriancable Did you ever get around to checking what's going on using a packet sniffer? You said that in your case the problem wasn't a too-short lifetime, so the big question is what is triggering this, and whether it is a router bug or a kernel bug.

Mar 30 '22 23:03 mvduin

@mvduin - I did not and unfortunately it seems the problem is now fixed on my router. (Eero Pro 6 - at some point after 6.6.0 but <= 6.9.0.)

It happened whenever I received an RA message, which triggered an RTM_DELADDR from the kernel followed immediately by an RTM_NEWADDR. Regardless of where the ultimate "blame" lies for this, the right place to fix or work around it is clearly in the kernel, since it doesn't just affect Avahi (e.g. it also causes ERR_NETWORK_CHANGED in Chrome) and it happens with multiple routers.

In our application however (embedded IoT device) the only real impact was on Avahi so we implemented the 250ms workaround there. That is now widely deployed with no reported issues. Much lower risk than kernel work.

Mar 31 '22 00:03 adriancable

@adriancable Depends on whether the RA is actually handled by the kernel or by a network manager in userspace, e.g. systemd-networkd disables the kernel's RA handling (/proc/sys/net/ipv6/conf/$INTERFACE/accept_ra is set to zero), but dunno what other network managers do.

Mar 31 '22 04:03 mvduin

Interesting. On our device, which uses NetworkManager, /proc/sys/net/ipv6/conf/*/accept_ra is 1.

So presumably, this means that whatever Linux-side issue there may be, it is (at least) in the kernel?
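Checking this setting on all interfaces at once is straightforward. A minimal Python sketch that reads the same /proc files mentioned above; the function name and the `base` parameter are assumptions added for illustration:

```python
from pathlib import Path


def accept_ra_settings(base="/proc/sys/net/ipv6/conf"):
    """Return {interface: accept_ra value} for every interface directory under base.

    Kernel meaning of the value: 0 = do not accept RAs, 1 = accept RAs when
    forwarding is disabled, 2 = accept RAs even when forwarding is enabled.
    """
    settings = {}
    for path in sorted(Path(base).glob("*/accept_ra")):
        settings[path.parent.name] = int(path.read_text().strip())
    return settings


if __name__ == "__main__":
    for iface, value in accept_ra_settings().items():
        print(f"{iface}: accept_ra={value}")
```

A non-zero value on the affected interface means the kernel itself processes router advertisements there, as opposed to a userspace network manager.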

Mar 31 '22 04:03 adriancable

Yep. It also means that people with the same problem could try using systemd-networkd as network manager as a workaround, if it suits their use-case (e.g. embedded or server more likely than desktop, although I personally do also use it on my laptop).

Trivial config for the lazy (save as /etc/systemd/network/99-default.network):

# fallback config for any interface other than loopback
[Match]
Type=!loopback

[Network]
# enable DHCPv4.  note that IPv6 by default uses RA, which will trigger DHCPv6 automatically if needed
DHCP=ipv4

# optional: use dns search domain(s) provided by dhcp and RA
[DHCP]
UseDomains=yes
[IPv6AcceptRA]
UseDomains=yes

Mar 31 '22 05:03 mvduin

Looks like avahi is just the messenger here. It should be fixed in whatever manages network interfaces like that. Closing.

Oct 03 '23 11:10 evverx