Intermittent discovery issues

Open yangm97 opened this issue 4 years ago • 20 comments

It appears that I'm having some intermittent discovery issues. I have avahi running on the same machine and this pops up on the avahi log:

*** WARNING: Detected another IPv4 mDNS stack running on this host. This makes mDNS unreliable and is thus not recommended. ***

I wonder if that is somehow related?

yangm97 avatar Sep 09 '19 16:09 yangm97

I wonder if that is somehow related?

Yes, this library uses dnssd to make the accessory discoverable on the local network.

brutella avatar Sep 19 '19 06:09 brutella

I see. What would be the cleanest way to solve this issue? Drop avahi and change the dnssd listen method to take a dnssd object, which would be initialized with my other services?

yangm97 avatar Sep 23 '19 15:09 yangm97

I think that the best way would be to just use Avahi where possible (ex. on Linux).

brutella avatar Sep 26 '19 09:09 brutella

A random thought just came to my mind, which may resolve this issue.

What about setting the hostname of the announced Bonjour service NOT to the local hostname? You would have to explicitly set the Host in the dnssd.Config like so:

dnsCfg := dnssd.Config{
    ....
    Host: "Testing",
    ....
}

Does this help?

brutella avatar Dec 07 '19 21:12 brutella

Could you please test it with the hostname branch?

brutella avatar Feb 21 '20 07:02 brutella

Sure!

yangm97 avatar Feb 21 '20 23:02 yangm97

Instead of getting a red "Unreachable" message, I now get a gray "Unavailable" one. Otherwise it's the same deal as before (reachable from the iPhone but not from the Mac).

I will try disabling avahi so we can isolate that.

yangm97 avatar Feb 22 '20 15:02 yangm97

The homed log is a bit spammy. Do you know what I should be looking for when this happens?

yangm97 avatar Mar 03 '20 11:03 yangm97

Re: "Unavailable": what type of accessories are you creating? It looks like they are incompatible with Apple Home.

brutella avatar Mar 03 '20 13:03 brutella

Pretty sure they're compatible, since that message comes and goes. I could share some snippets from my code if that helps, but I guess it is just standard stuff.

yangm97 avatar Mar 14 '20 13:03 yangm97

We have this issue too, with any configuration, including hklight. The Bridge/Accessory shows up in Discovery under the hap service for anywhere from 5 minutes to a day, but eventually vanishes and only returns after restarting the process. It also turns unavailable in HomeKit at the same time.

We previously ran homebridge on the same box, but don't anymore, just a single hc instance. Nothing else binding to udp/5353.
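
One way to double-check the "nothing else on udp/5353" claim on Linux is to count the sockets bound to that port. The following is a minimal sketch, not part of hc: it assumes a Linux host where `/proc/net/udp` is readable, and the helper name `countUDPSockets` is made up for illustration. It only covers IPv4 sockets and does not say which process owns them (something like `ss -ulpn` can show that).

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// countUDPSockets counts entries bound to the given local port in text
// formatted like /proc/net/udp (Linux only). Hypothetical helper.
func countUDPSockets(procText string, port int) int {
	count := 0
	scanner := bufio.NewScanner(strings.NewReader(procText))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 {
			continue
		}
		// fields[1] is local_address, e.g. "00000000:14E9" (hex IP, hex port).
		idx := strings.LastIndex(fields[1], ":")
		if idx < 0 {
			continue // header line has no colon in this column
		}
		p, err := strconv.ParseUint(fields[1][idx+1:], 16, 16)
		if err != nil {
			continue // malformed entry
		}
		if int(p) == port {
			count++
		}
	}
	return count
}

func main() {
	data, err := os.ReadFile("/proc/net/udp")
	if err != nil {
		fmt.Println("cannot read /proc/net/udp:", err)
		return
	}
	n := countUDPSockets(string(data), 5353)
	fmt.Printf("IPv4 UDP sockets bound to port 5353: %d\n", n)
}
```

The hc process itself binds to port 5353, and a single mDNS stack may open more than one socket, so the count alone is only a hint. Comparing it with the accessory process stopped and running shows whether another stack is present.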

Debug logs show nothing about discovery, only the hub and other clients eventually disconnecting.

ghost avatar May 19 '20 11:05 ghost

@schittler Are you using the latest version 1.2.2 of hc?

brutella avatar May 19 '20 12:05 brutella

Yes, our go.mod looks like this:

go 1.14

require (
	github.com/brutella/hc v1.2.2
	github.com/miekg/dns v1.1.29 // indirect
	github.com/rs/zerolog v1.18.0
	github.com/spf13/cobra v1.0.0
	github.com/spf13/viper v1.4.0
	github.com/xiam/to v0.0.0-20200126224905-d60d31e03561 // indirect
	golang.org/x/crypto v0.0.0-20200510223506-06a226fb4e37 // indirect
	golang.org/x/net v0.0.0-20200513185701-a91f0712d120 // indirect
	golang.org/x/sys v0.0.0-20200515095857-1151b9dac4a9 // indirect
)

ghost avatar May 19 '20 13:05 ghost

On what hardware are you running it?

brutella avatar May 19 '20 14:05 brutella

x86_64 Linux, bare metal, with Ethernet connected to a bridge for some libvirt VMs. Our binary runs on bare metal, though.

If you'd like, I can clean up the source and push it later. Maybe we're making a colossal mistake somewhere that we're not aware of.

ghost avatar May 19 '20 15:05 ghost

I can report back that, after more than a year of not seriously using HomeKit due to issues like this, we have found the problem.

Check whether you have a managed switch with IGMP turned on somewhere in your home. IGMP is useful in enterprise networks but will lead to all sorts of multicast failures (including multicast DNS) if it is not carefully configured across your entire subnet, especially if some of your switches talk IGMP and some do not (Linux and Hyper-V bridges included). If you have an IoT subnet with mDNS reflection, things might get even worse.

In our case, it was a TP-Link managed switch. It even logged to its web interface that it had stopped delivering multicast to ports with unmanaged switches behind them, because at that point it becomes a race over which device behind the port responds first, and on a Linux bridge that's very often the host.

ghost avatar Jul 26 '20 11:07 ghost

I've been experiencing very similar behaviour, with hc v1.2.3, though I do not have any managed switches on my network.

@brutella are the changes mentioned in https://github.com/brutella/hc/issues/179#issuecomment-658089093 aimed at fixing this behaviour? I'll test it out at some point soon to see if it helps...

hairyhenderson avatar Nov 16 '20 02:11 hairyhenderson

I do not have any managed switches on my network.

Check your AP. UniFi has it turned on by default as well.

A good way to test this is to check with any mDNS client if the accessory shows up, from multiple points in your network. This is how we originally diagnosed this issue.
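
For anyone without an mDNS browser at hand, the check can be approximated with a few lines of Go. This is a rough sketch using only the standard library, not code from hc or dnssd: it assumes the accessory is advertised under `_hap._tcp.local` (the HomeKit service type), and `buildPTRQuery` is a made-up helper name. It sends a one-shot query from an ephemeral port, so responders should answer by unicast, and it prints who replied without decoding the answers.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
	"strings"
	"time"
)

// buildPTRQuery encodes a single-question DNS query asking for the PTR
// records of name. Hypothetical helper for illustration.
func buildPTRQuery(name string) []byte {
	// 12-byte header: ID 0, flags 0 (standard query), QDCOUNT 1.
	msg := make([]byte, 12)
	binary.BigEndian.PutUint16(msg[4:6], 1)
	// QNAME: each label is prefixed by its length; a zero byte ends the name.
	for _, label := range strings.Split(strings.Trim(name, "."), ".") {
		msg = append(msg, byte(len(label)))
		msg = append(msg, label...)
	}
	msg = append(msg, 0)
	// QTYPE 12 (PTR), QCLASS 1 (IN).
	return append(msg, 0, 12, 0, 1)
}

func main() {
	// An ephemeral source port marks this as a one-shot (legacy) query.
	conn, err := net.ListenUDP("udp4", &net.UDPAddr{})
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer conn.Close()

	mdnsAddr := &net.UDPAddr{IP: net.IPv4(224, 0, 0, 251), Port: 5353}
	if _, err := conn.WriteToUDP(buildPTRQuery("_hap._tcp.local"), mdnsAddr); err != nil {
		fmt.Println("send failed:", err)
		return
	}

	// Collect replies for a few seconds, then stop.
	conn.SetReadDeadline(time.Now().Add(3 * time.Second))
	buf := make([]byte, 1500)
	for {
		n, from, err := conn.ReadFromUDP(buf)
		if err != nil {
			return // deadline reached or read error
		}
		fmt.Printf("%d-byte response from %s\n", n, from.IP)
	}
}
```

Running it from a wired host, a Wi-Fi client, and the accessory host itself, then comparing which runs see a reply from the accessory's IP, gives a rough picture of where the multicast query is being dropped. The query leaves through whatever interface the OS picks for multicast, which may not be the one you expect on a multi-homed machine.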

ghost avatar Nov 17 '20 11:11 ghost

Check your AP. UniFi has it turned on by default as well.

I have Google WiFi mesh devices, I wonder if these do... I'm not sure how to figure that out as the management app is pretty limited.

A good way to test this is to check with any mDNS client if the accessory shows up, from multiple points in your network.

That's a good idea. From WiFi, I'm seeing it appear for a few minutes, then a new one appears with a -2 suffix, then another one with -2-2, ad nauseam for ~5 minutes, after which they all disappear. I'll see what I can see directly on Ethernet, though!

hairyhenderson avatar Nov 21 '20 00:11 hairyhenderson

FYI I've refactored hc and made a new library out of it. The new library is available as hap.

Please check if your issue is resolved by the new implementation.

brutella avatar Mar 02 '22 13:03 brutella