esp-homekit
Not Responding on ESP32 - mDNS Sends Remove Command after ~2 Minutes
I can turn the device (a lamp accessory on an ESP32) on and off for anywhere from 20 s to 5 min after the ESP32 restarts, then iOS says No Response. I checked the mDNS advertisement, as I've seen suggested in other issues. Here is the log from the mdns command
I am on the latest master commit of esp-homekit. And this commit of the esp-idf.
Chip Information from the flash log:
Chip is ESP32D0WDQ6 (revision 1)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
Log from make monitor: Here
Can you do a packet capture with Wireshark and check whether the device actually sends an mDNS packet or the record just somehow expires? Usually mDNS records are set up to expire after 4500 seconds, and the device reannounces itself with the same period.
Can you uncomment this line to get more logging on what is happening inside mDNS?
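One way to tell the two cases apart in a capture: per RFC 6762, an explicit removal ("goodbye") is an mDNS response whose records carry a TTL of 0, whereas a record that simply expires is never preceded by such a packet. The Python sketch below is an illustrative helper and not part of esp-homekit. It assumes you have the raw UDP payload of a captured mDNS packet (for example, exported from Wireshark) and lists each resource record's name, type, and TTL. The names `read_name`, `record_ttls`, and `is_goodbye` are hypothetical.

```python
import struct


def read_name(packet, offset):
    """Decode a DNS name at offset; return (name, offset just past the name)."""
    labels = []
    end = None  # position after the name in the original record, set on first pointer
    hops = 0
    while True:
        length = packet[offset]
        if length & 0xC0 == 0xC0:  # compression pointer (2 bytes)
            if end is None:
                end = offset + 2
            offset = ((length & 0x3F) << 8) | packet[offset + 1]
            hops += 1
            if hops > 32:
                raise ValueError("compression pointer loop")
            continue
        offset += 1
        if length == 0:  # root label terminates the name
            break
        labels.append(packet[offset:offset + length].decode("utf-8", "replace"))
        offset += length
    return ".".join(labels), (end if end is not None else offset)


def record_ttls(packet):
    """Return [(name, rtype, ttl)] for every resource record in a DNS/mDNS message."""
    _id, _flags, qd, an, ns, ar = struct.unpack_from("!6H", packet, 0)
    offset = 12
    for _ in range(qd):  # skip the question section
        _, offset = read_name(packet, offset)
        offset += 4  # qtype + qclass
    records = []
    for _ in range(an + ns + ar):
        name, offset = read_name(packet, offset)
        rtype, _rclass, ttl, rdlen = struct.unpack_from("!HHIH", packet, offset)
        offset += 10 + rdlen  # fixed fields + rdata
        records.append((name, rtype, ttl))
    return records


def is_goodbye(packet):
    """True if any record in the packet has TTL 0 (an explicit removal)."""
    return any(ttl == 0 for _, _, ttl in record_ttls(packet))
```

If no packet from the device makes `is_goodbye` return true around the time of the Rmv event, the removal was most likely decided on the browsing side (expiry or a failed refresh) and not announced by the accessory.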
Okay, so I now have it running on two different devices. I did a pcap with Wireshark and found that neither is sending packets. However, both drop off at the exact same time.
21:45:58.301 Add 2 7 local. _hap._tcp. Smart Plug
22:00:11.631 Add 2 7 local. _hap._tcp. Tall Lamp
22:04:35.195 Rmv 1 7 local. _hap._tcp. Smart Plug
22:04:35.195 Rmv 0 7 local. _hap._tcp. Tall Lamp
After uncommenting that line, I received no additional output from mdns. :/
@AidanLovelace did you find a solution?
It seems I'm not the only one facing the mDNS remove issue! It's probably a library issue.
I ended up switching to an ESP-8266 which does not have this issue.
@AidanLovelace @maximkulkin I have a very similar issue on an ESP8266MOD 4 MB (if that matters).
In my case I frequently observe that after Rmv events the devices do not announce themselves again for some period of time (quite random and hard to measure). After some time I see Add events again, and then the process repeats.
Example:
14:23:58.728 Add 3 4 local. _hap._tcp. Garage Inside
14:23:58.728 Add 2 4 local. _hap._tcp. Garage Door
14:24:12.884 Rmv 1 4 local. _hap._tcp. Garage Inside
14:24:12.884 Rmv 0 4 local. _hap._tcp. Garage Door
14:24:15.047 Add 2 4 local. _hap._tcp. Garage Door
14:24:17.046 Add 2 4 local. _hap._tcp. Garage Inside
14:47:25.499 Rmv 0 4 local. _hap._tcp. Garage Inside
15:02:46.522 Add 2 4 local. _hap._tcp. Garage Inside
If we focus only on the device Garage Inside (I added a first column with the time difference from the previous event):
14:23:58.728 Add 3 4 local. _hap._tcp. Garage Inside
14s 14:24:12.884 Rmv 1 4 local. _hap._tcp. Garage Inside
4s 14:24:17.046 Add 2 4 local. _hap._tcp. Garage Inside
23m 8s 14:47:25.499 Rmv 0 4 local. _hap._tcp. Garage Inside
15m 21s 15:02:46.522 Add 2 4 local. _hap._tcp. Garage Inside
We can see that the time interval between the last two events, Rmv and Add, is 15 minutes and 21 seconds. For that long the device was not accessible from the Home app, while at the same time I could ping the device both by IP and by mDNS name: ping Garage\ Inside.local.
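The gaps above can be computed mechanically from the browse output. This is a minimal Python sketch, assuming the log lines have the whitespace-separated shape shown in this thread (timestamp, Add/Rmv, flags, interface, domain, service type, instance name) and that the log does not cross midnight. `parse_browse_log` and `outages` are hypothetical helper names.

```python
import re
from datetime import datetime

# One browse event per line, e.g.
# "14:23:58.728 Add 3 4 local. _hap._tcp. Garage Inside"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{3})\s+(?P<event>Add|Rmv)\s+"
    r"(?P<flags>\d+)\s+(?P<iface>\d+)\s+(?P<domain>\S+)\s+"
    r"(?P<stype>\S+)\s+(?P<name>.+?)\s*$"
)


def parse_browse_log(text):
    """Return [(timestamp, event, instance_name)], skipping lines that do not match."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
        events.append((ts, m["event"], m["name"]))
    return events


def outages(events):
    """Return [(instance_name, gap)] for every Rmv that is later followed by an Add."""
    pending = {}  # instance name -> time of the most recent unanswered Rmv
    result = []
    for ts, event, name in events:
        if event == "Rmv":
            pending[name] = ts
        elif name in pending:
            result.append((name, ts - pending.pop(name)))
    return result
```

Fed the Garage example log above, this reports outages of about 2.2 s for Garage Door, and of about 4.2 s and 15 min 21 s for Garage Inside.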
In addition, I have a MacBook paired with my iPhone and set up with the same Home configuration. I also observed that sometimes one of the devices is available on the MacBook but not on the iPhone. After some time both devices might be unavailable on both the MacBook and the iPhone, and later both are available again on both. I even encountered a case where I had the Home app open on the MacBook and the iPhone: both devices were accessible from the MacBook, but only one from the iPhone. After I killed the Home app on the MacBook, the device that had been unavailable on the iPhone instantly became available.
I'm having the same trouble using my shelly 1 v3 (esp8266ex). Adding and removing all the time. Was using RavenSystem HAA, then compiled and flashed esp-homekit alone, same issue as on HAA. Very strange.
I'm not that familiar with the mDNS machinery, but it seems like the issue is somehow related only to _hap, as mDNS for .local works fine. I will try to collect mDNS logs over the weekend after enabling the debug logs:
// #define qDebugLog // Log activity generally
// #define qLogIncoming // Log all arriving multicast packets
// #define qLogAllTraffic // Log and decode all mDNS packets
and cross-reference with dns-sd -B _hap._tcp local output.
Does it depend on network load (from mDNS?) It may work OK on a Wi-Fi router test network but fail on a real-life home network. I have tried three mDNS implementations so far: lwIP native, mdnsresponder ported to the RTOS SDK, and the mdns SDK component. Always the same :( Service discovery gets interrupted somehow (even without starting pairing).
In my case it was due to the Wi-Fi AP. The stupid Apple Express Wi-Fi seems to have trouble with mDNS. I have now tried a FRITZ!Box and a Linksys without any issues.
IMO this could be an lwIP and/or mDNS implementation issue. In my network, open and rtos-sdk give the same result.
I tried on TP-Link and Linksys routers and saw the same issue.
I found an mDNS implementation that keeps the pairing process OK. But now the logs say paired successfully while the Home app is still pairing (it sends pair setup step 1 again). I guess I hacked something... Anyway, the reason was mDNS.
This might be a good reference on mDNS with proper HAP support: https://github.com/esp8266/Arduino/pull/5442
@nonameplum can you test PR #123?
Sure, but I'm away from home and I don't have access to devices. I will check later this week.
@d4rkmen I checked the PR on three devices and there's not much difference in my case. I still see RMV events that are sometimes not followed by ADD. The accessories are also unavailable in the Home app at that time.
@nonameplum Sorry to hear that. It seems the ESP32 build has other flaws.
@d4rkmen All three devices on ESP8266.
Still I see RMV events sometimes not followed by ADD
Ah, you mean WiFi network connect / disconnect?
I mean records from dns-sd -B _hap._tcp local.
@d4rkmen I replaced the mdnsresponder.h usage in homekit's port.c with lwip/apps/mdns.h, which I copied into my esp-open-rtos from https://github.com/ourairquality/lwip/tree/esp-open-rtos/src/apps/mdns . It looks promising for now, but I'm still testing it.
@d4rkmen I made a fork that replaces mdnsresponder.h with the lwIP mDNS implementation mentioned above. Could you check whether it works for you? For me, device availability seems more stable. I also made a test change in server.c to not set the client->disconnect flag, as I sometimes encounter an issue where some connections get into a loop of connect -> process -> disconnect instead of staying connected. That ends with the device being unavailable in the Home app even though mDNS works as expected and the device is visible via dns-sd -B _hap._tcp local.
https://github.com/nonameplum/esp-homekit/tree/my_master