avahi-daemon blocks mDNS for others
Setup:
- Avahi 0.7
- config:
[server]
use-ipv4=yes
use-ipv6=no
allow-interfaces=eth0,eth1
#deny-interfaces=eth0
#check-response-ttl=no
#use-iff-running=no
#enable-dbus=yes
disallow-other-stacks=no
#allow-point-to-point=no
#cache-entries-max=4096
#clients-max=4096
#objects-per-client-max=1024
#entries-per-entry-group-max=32
ratelimit-interval-usec=1000000
ratelimit-burst=1000
[wide-area]
enable-wide-area=yes
[publish]
#disable-publishing=no
#disable-user-service-publishing=no
#add-service-cookie=no
#publish-addresses=yes
publish-hinfo=yes
publish-workstation=yes
#publish-domain=yes
#publish-dns-servers=192.168.50.1, 192.168.50.2
#publish-resolv-conf-dns-servers=yes
#publish-aaaa-on-ipv4=yes
#publish-a-on-ipv6=no
[reflector]
enable-reflector=no
#reflect-ipv=no
[rlimits]
#rlimit-as=
#rlimit-core=0
#rlimit-data=4194304
#rlimit-fsize=0
#rlimit-nofile=768
#rlimit-stack=4194304
When avahi-daemon is running, our custom software cannot discover some of the services on the LAN. Our software uses mDNS discovery. When avahi-daemon is down, our custom software discovers all of the services on the LAN.
Any help? Is it possible that avahi blocks other software from doing discovery with mDNS?
+1 ;)
Well, AFAIK you're supposed to cooperate with the running mDNS daemon, as per the spec, so if Avahi has the mDNS socket bound and you cannot listen too, then you can't receive replies. A good approach is to write your app against dns_sd.h, and a better one is to interface with Avahi directly through libavahi-client or D-Bus.
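If it helps, here is a minimal sketch of the D-Bus route using dbus-python and GLib. Assumptions: dbus-python and PyGObject are installed, avahi-daemon is running, and _http._tcp is just an example service type.

```python
# Sketch: browse services through the Avahi daemon's D-Bus API instead of
# opening our own mDNS socket. "_http._tcp" is only an example.
import dbus
from dbus.mainloop.glib import DBusGMainLoop
from gi.repository import GLib

DBusGMainLoop(set_as_default=True)
bus = dbus.SystemBus()

server = dbus.Interface(bus.get_object('org.freedesktop.Avahi', '/'),
                        'org.freedesktop.Avahi.Server')

# -1, -1 = any interface, any protocol (AVAHI_IF_UNSPEC / AVAHI_PROTO_UNSPEC).
browser_path = server.ServiceBrowserNew(-1, -1, '_http._tcp', 'local',
                                        dbus.UInt32(0))
browser = dbus.Interface(bus.get_object('org.freedesktop.Avahi', browser_path),
                         'org.freedesktop.Avahi.ServiceBrowser')

def on_item_new(interface, protocol, name, stype, domain, flags):
    print('found service: %s (%s) in %s' % (name, stype, domain))

browser.connect_to_signal('ItemNew', on_item_new)
GLib.MainLoop().run()
```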
@Piero512
Thanks for the answer. The thing is that when running in parallel with Avahi, our software can discover only some of the services. For now, we have worked around it by disabling avahi.
if Avahi has the socket bound for mDNS and you cannot listen too, then you can't receive replies
@Piero512 This is incorrect. Multiple UDP sockets can bind to the same address if SO_REUSEADDR is enabled (which avahi does unless you explicitly configure disallow-other-stacks=yes), and multicasts will be delivered to all sockets that have joined the multicast group. If you enable the IP_MULTICAST_LOOP option (which avahi does) then you'll also receive multicasts sent by other sockets on the same machine, allowing mDNS to work properly between different mDNS stacks on one system, provided all of them are implemented properly.
That last condition is probably the issue: most likely the custom software is buggy. In particular, you must not bind to port 5353 unless you're a full, properly working mDNS stack, and if you use SO_REUSEADDR then you also need to enable IP_MULTICAST_LOOP. If you just want to do simple queries without implementing a proper mDNS stack, or you're performing queries that request unicast replies, you must not use port 5353 as the source port; use a random source port instead (without SO_REUSEADDR and without binding to any multicast address).
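To make the first requirement concrete, a full mDNS stack that wants to coexist with avahi sets up its receiving socket roughly like this sketch (IPv4 only, no error handling, using only the socket options discussed above):

```python
# Sketch of an mDNS socket that coexists with avahi on the same host.
# Only do this if you implement a full, properly working mDNS stack;
# otherwise use a random source port instead.
import socket
import struct

MDNS_GROUP = "224.0.0.251"
MDNS_PORT = 5353

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Let several mDNS stacks bind port 5353 at the same time (avahi allows
# this unless disallow-other-stacks=yes is configured).
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", MDNS_PORT))

# Join the mDNS multicast group on the default interface.
mreq = struct.pack("4s4s", socket.inet_aton(MDNS_GROUP),
                   socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# Loop our own outgoing multicasts back so other stacks on this machine
# (e.g. avahi) also see them.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
```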
(Also, do not send unicast queries to port 5353. While the RFC says that "in specialized applications there may be rare situations where it makes sense" to do so, without explaining why you'd ever want to do that, in practice this is a bad idea and will not work reliably if there are multiple mDNS stacks on that system.)
most likely the custom software is buggy
It seems to be the case. Just to make sure, I ran both avahi and systemd-resolved (with its mDNS responder mode enabled) side by side and they had no problem with each other: it was possible to resolve via both avahi-resolve/avahi-browse and resolvectl, and their caches were populated properly. Closing.
The problem is that you cannot bind a unicast socket to multiple applications, so legacy unicast is broken. It's not often used, except that many wifi networks deploy "multicast optimisation", which turns multicasts into unicasts, which then hits this issue.
But there's nothing we can do about it. So still valid to close.
If you're hitting this issue, turning off multicast optimisation on your wifi will likely fix it, as legacy unicast is otherwise rarely used.
Having multiple applications bound to 5353 (with SO_REUSEADDR) is not a problem for legacy resolving at all. For legacy resolving the client binds to a random source port, sends a multicast to port 5353 (no problem, all applications bound to 5353 receive the multicast) and receives unicast responses (no problem, any application bound to 5353 can send unicast replies).
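As a concrete sketch of that legacy flow (placeholder name example.local, hand-built query, no caching or retries, nothing resembling a full mDNS stack):

```python
# Sketch of a legacy one-shot mDNS query: use a random source port
# (NOT 5353, no SO_REUSEADDR), multicast the question to 224.0.0.251:5353,
# and read the unicast reply. "example.local" is a placeholder name.
import socket
import struct

def build_query(name, qtype=1, qclass=1):
    # DNS header: id=0, flags=0, QDCOUNT=1, everything else 0.
    msg = struct.pack(">HHHHHH", 0, 0, 1, 0, 0, 0)
    for label in name.split("."):
        msg += bytes([len(label)]) + label.encode("ascii")
    msg += b"\x00"                            # root label terminates the name
    msg += struct.pack(">HH", qtype, qclass)  # QTYPE=A, QCLASS=IN, no QU bit
    return msg

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(3)
# No bind() call: the kernel picks a random source port for us.
sock.sendto(build_query("example.local"), ("224.0.0.251", 5353))

data, src = sock.recvfrom(1500)
print("got %d bytes from %s" % (len(data), src[0]))
```

Note that the reply's source address is the responder's own unicast address, not the multicast group, which is relevant to the dig discussion further down.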
The only things for which having multiple applications bound to 5353 is a problem are:
- Receiving unicast replies to 5353, but this only happens if you explicitly ask for unicast replies in a query by setting the QU flag (as an optimization in cases where you believe no one else will care about the response, e.g. cache revalidation after link up), so the solution is simply to refrain from using the QU flag if you've enabled SO_REUSEADDR.
- Receiving direct unicast queries to 5353. The spec allows this but offers no use-case for it, other than "for specialized applications there may be rare situations where it makes sense". I've never seen anything use this.
Can you elaborate on the wifi "multicast optimization" thing? I'm not sure what you could be referring to.
"Multicast-to-unicast conversion" makes them unicast at the wifi MAC layer, the destination IP remains the same so the OS should presumably treat it the same. APs generally also disable it if there's too many interested clients since doing this conversion turns a single multicast packet into as many unicast packets as there are interested clients.
Having said that, it relies on IGMP snooping to know who's interested in what, and IGMP snooping bugs can definitely create havoc.
Receiving direct unicast queries to 5353. The spec allows this but offers no use-case for it, other than "for specialized applications there may be rare situations where it makes sense". I've never seen anything use this.
I use this :-) It's the most reliable way to use tools like dig to resolve mDNS things. I agree it isn't very convenient because it's necessary to manually request and follow all the PTR, SRV, and TXT records, but sometimes "unicast" tools are all I have.
It's the most reliable way to use tools like dig to resolve mDNS things
No, it's currently the only way to use dig for that purpose, but I wouldn't call it "reliable", since it breaks if there's another application that binds to port 5353 on your target (e.g. Google Chrome).
It would however certainly be useful to either add proper mDNS querying support to dig or have a similar tool dedicated to mDNS.
BTW:
@bep:~$ dig @224.0.0.251 -p 5353 -b 10.0.11.1 nena.local
; <<>> DiG 9.16.37-Debian <<>> @224.0.0.251 -p 5353 -b 10.0.11.1 nena.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5990
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;nena.local. IN A
;; ANSWER SECTION:
nena.local. 10 IN A 10.0.11.27
;; Query time: 0 msec
;; SERVER: 10.0.11.27#5353(224.0.0.251)
;; WHEN: Thu Oct 05 18:09:40 CEST 2023
;; MSG SIZE rcvd: 44
Similarly dig @ff02::fb%3 -p 5353 works too (the scope id must be numerical; it doesn't accept an interface name).
Or I should say it used to work; it broke with bind 9.18 when they switched how they manage sockets. Previously they used non-connected sockets, while now they use connected sockets, so they never even see the replies because the replies have the "wrong" source address.
Honestly this seems like a bad change anyhow; I think a diagnostic utility like dig ought to accept replies from any source address, at the very least to be able to alert the user when this happens (e.g. due to a broken NAT setup or other funky things going on).
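For illustration, this sketch (again with a placeholder name example.local and a hand-built query) shows the difference: the unconnected socket sees the reply, while the connected one filters it out because the reply comes from the responder's unicast address rather than 224.0.0.251.

```python
# Sketch: why a connected UDP socket never sees mDNS replies. The query is
# multicast to 224.0.0.251, but the answer arrives from the responder's own
# unicast address; a connected socket silently drops such datagrams.
# "example.local" is a placeholder name.
import socket
import struct

def build_query(name):
    msg = struct.pack(">HHHHHH", 0, 0, 1, 0, 0, 0)
    for label in name.split("."):
        msg += bytes([len(label)]) + label.encode("ascii")
    return msg + b"\x00" + struct.pack(">HH", 1, 1)  # QTYPE=A, QCLASS=IN

query = build_query("example.local")

# Unconnected socket: recvfrom() delivers the reply, whatever its source.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.settimeout(3)
s.sendto(query, ("224.0.0.251", 5353))
print(s.recvfrom(1500)[1])   # e.g. ('10.0.11.27', 5353), not the group address

# Connected socket: the kernel only passes up datagrams whose source matches
# the connected address, so the same reply is dropped and recv() times out.
c = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
c.settimeout(3)
c.connect(("224.0.0.251", 5353))
c.send(query)
try:
    c.recv(1500)
except socket.timeout:
    print("no reply seen on the connected socket")
```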