nmos-cpp
avahi 'leaking' memory in dbus-daemon
We are running nmos-cpp as a custom node on a Linux system, with avahi-daemon to do mDNS. With this we are seeing a slow but steady increase in memory usage of the dbus-daemon. After digging into this we found out that this is caused by the combination of the avahi-compat-libdns_sd library and nmos-cpp.
Flow:
- the node registers a service using service_advertiser
- this calls DNSServiceRegister and stores the created DNSServiceRef in the 'services' vector (a member of service_advertiser_impl)
- the avahi-compat library creates a dbus connection for each DNSServiceRef instance, to communicate with the avahi-daemon
- avahi-compat also couples this dbus connection with its own event loop (avahi_simple_poll, stored in DNSServiceRef) to process incoming messages
- the event loop is not run unless triggered through DNSServiceProcessResult
- the incoming dbus messages are therefore not processed, eventually leading to a blocked socket between nmos-cpp and dbus-daemon
- every new dbus connection sends a message to all dbus connections (NameOwnerChanged); dbus-daemon places the message in a queue
- the nmos-cpp advertisement socket is blocked, so the queue is never emptied. This leads to a growing number of messages in the message queue, which translates to growing memory usage in the dbus-daemon process
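To make the last two steps concrete, here is a toy model of the backlog. It is only an illustration of the mechanism described above, not dbus-daemon's actual data structures; the names toy_connection and toy_broker are made up for this sketch, and it assumes one NameOwnerChanged message per new connection.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and do not mirror dbus-daemon internals.
struct toy_connection
{
    bool reads_messages;              // false models the stalled advertisement connection
    std::deque<std::string> outgoing; // messages the broker has queued for this peer
};

struct toy_broker
{
    std::vector<toy_connection> connections;

    // A new peer connects: every existing connection is sent one notification.
    void connect(bool reads_messages)
    {
        for (auto& c : connections) c.outgoing.push_back("NameOwnerChanged");
        connections.push_back(toy_connection{ reads_messages, {} });
    }

    // Peers that service their socket drain their queue; stalled peers do not.
    void deliver()
    {
        for (auto& c : connections)
            if (c.reads_messages) c.outgoing.clear();
    }

    // Total number of messages still held by the broker.
    std::size_t queued() const
    {
        std::size_t n = 0;
        for (const auto& c : connections) n += c.outgoing.size();
        return n;
    }
};
```

With one stalled connection and a stream of short-lived well-behaved ones, queued() grows by one per new connection and never shrinks, which matches the slow, steady memory growth we observe.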
The real issue is arguably in dbus-daemon for not having a mechanism to clear the messages, or in avahi-compat for not processing the incoming messages. However, it might be helpful to have this information and maybe work around it in nmos-cpp.
I think one solution would be to use the proper avahi-client library for mDNS on Linux (instead of the compat library), though this requires some more work. For now, we are running a loop in nmos-cpp that calls DNSServiceProcessResult, so that avahi processes the incoming dbus messages.
I've attached a patch that shows this (fix-dbus-leak.patch); please note it will only compile on non-Windows platforms and lacks documentation. It's currently not ready for a PR, but should that be desired I might try to clean it up.
- avahi version: 0.7
- dbus version: 1.12.6 (also tested 1.12.20)
- nmos-cpp: f607cd7e8278c72a7c99509d9887d4055e1e953f
Reproduction: start an nmos-cpp node with advertisement enabled (verify it's advertising), then run a test program that spams dbus connections: dbus_hammer_minimal.cpp
Hi Fabio,
Thank you for using nmos-cpp, and for your findings. We will try it out over here. By the way, how about trying Apple's mDNSResponder (also known as mdnsd) instead of Avahi? Details can be found at https://github.com/sony/nmos-cpp/blob/master/Documents/Dependencies.md#dns-service-discovery.
Hi Simon,
Thank you for your response. Currently we are already using Avahi and we cannot mix them together, but it could be an alternative in the future, so I'll remember this suggestion.