openhab-addons icon indicating copy to clipboard operation
openhab-addons copied to clipboard

[hue] Changed discovery to mDNS; added HTTPS handling; refactor HTTPClient to use jetty shared client

Open cweitkamp opened this issue 2 years ago • 53 comments

  • Changed discovery to mDNS
  • Added HTTPS handling
  • Refactor HTTPClient to use Jetty shared client
  • Refactor to reduce code and simplify implementation

Depends on https://github.com/openhab/openhab-core/pull/2622 and https://github.com/openhab/openhab-core/pull/2635

Philips announced a brand new HUE API v2 being available. These are the first steps to add support for it.

The new API supports some really interesting new features like streaming SSEs. It uses HTTPS only. Additional, they will ~deprecate~ disable UPnP in Q2 2022 in favor of mDNS.

See their Migration Guide to new API.

Signed-off-by: Christoph Weitkamp [email protected]

cweitkamp avatar Dec 22 '21 15:12 cweitkamp

Wow, that's awesome, thank you - I already hoped that you would look into the new hue APIs! 😄 Especially the new SSE support is really nice to finally get instant state updates.

kaikreuzer avatar Dec 22 '21 16:12 kaikreuzer

I am planning to separate the implementation into three steps. This is the first one.

  1. Basic refactoring
  2. Restore current behavior on v2 API - meaning new endpoints, new model - and keep v1 for users who own older Hue Bridges in the background
  3. Add handling of SSE

cweitkamp avatar Dec 22 '21 16:12 cweitkamp

Continuing discussion with @lolodomo from this https://github.com/openhab/openhab-addons/pull/11834#issuecomment-1000286907

So would make sense in case to set it to 0

You could still do that by means of an entry in the '.cfg' file (rather than via the UI).

My concern was just to keep the parameter for the hue binding.

@lolodomo the main reason to remove the parameter was the brilliant idea of @kaikreuzer quoted below. Which would enable new systems to pre-discover potential Things on the network without any bindings having been installed. However if you insist that it must be a binding parameter (instead of an entry in a .cfg file) then we could not develop such a great feature.

I would have loved for a long long time already to extract different UPnP+mDNS discovery services into an "installed by default" bundle, so that the setup wizard could directly identify devices in your home without the need of installing all the bindings first.)

andrewfg avatar Dec 23 '21 13:12 andrewfg

My concern was to have a parameter, whatever the file in which to define it.

By the way, now that we move from UPnP to mDNS, is this grace period still useful for mDNS discovery?

lolodomo avatar Dec 23 '21 13:12 lolodomo

@cweitkamp : do you know if mDMD discovery is available for old v1 bridge?

lolodomo avatar Dec 23 '21 13:12 lolodomo

now that we move from UPnP to mDNS, is this grace period still useful for mDNS discovery?

I will install the JAR from this PR and test it on my Hue bridge to see if the mDNS disco suffers from the same problem as the UPNP disco.

BTW this time of year is a perfect time to test it. In my case the problem (on UPNP disco) only occurs when the Hue bridge is running "heavy" automations, and presumably the automation causes delays in sending the NOTIFY messages. This time of year is when I run a Christmas light colour loop in the front window of my house. This is quite a "heavy" automation.

andrewfg avatar Dec 23 '21 13:12 andrewfg

@lolodomo I am afraid not.

mDNS is available Hue bridge v2.x (squared-shaped) and supported with firmware version 1.23 and beyond.

For older bridges there still is the HueBridgeNupnpDiscovery which retrieves results from https://discovery.meethue.com/. Both methods are recommended by Philips.

cweitkamp avatar Dec 23 '21 13:12 cweitkamp

I will install the JAR from this PR and test it on my Hue bridge

Unfortunately I could not build the binding jar because there is some dependency on PEMTrustManager in the core, which I cannot find. => So @cweitkamp could you please give me a link to the built JAR?

andrewfg avatar Dec 23 '21 17:12 andrewfg

Here you are

@cweitkamp many thanks. I have it loaded on my test system; it discovered the Bridge Ok (presumably via mDNS); and when I instantiated the Bridge as a Thing, it discovered all the lamp Things Ok too. So far so good. I then deleted the Bridge Thing, and deleted everything from the Inbox; loaded up the Bridge with two "heavy" Hue Labs Christmas Automations; and I will let the test system run throughout this evening to see if the Bridge a) disappears from and reappears in the Inbox, or b) remains online stable. I will report back later.

andrewfg avatar Dec 23 '21 19:12 andrewfg

@cweitkamp I see that your JAR does still implement the config param for removalGracePeriod with default at 50 seconds, but for the avoidance of doubt, can you please advise if your code is recognising that param in the mDNS disco participant?

andrewfg avatar Dec 23 '21 20:12 andrewfg

@andrewfg Yes, the current implementation has full support for mDNS grace period configuration via services.cfg file. Set discovery.hue:removalGracePeriod = 0 to disable it.

cweitkamp avatar Dec 23 '21 20:12 cweitkamp

^ Unfortunately so far I only got an indeterminate result..

  • Whether removalGracePeriod has a value of 50 or 1, the Bridge is never removed from the Inbox.
  • HOWEVER even if I completely disconnect the Ethernet cable, the Bridge is still not removed from the Inbox!!

So unfortunately I cannot say (yet) if the Inbox coming/going problem is actually resolved in Hue API v2. I suspect that my router or other device on my LAN is doing (m)DNS caching / re-broadcasting, and I suppose that is why the problem "appears" to be resolved.

Next Steps: Rather than trying to debug my working LAN, or flush the respective mDNS cache, I will assemble a rudimentary LAN and try to test again on that instead. To do that I think I need to buy myself a spare basic router/switch, and it being Christmas, this might take me a day or two.

andrewfg avatar Dec 24 '21 13:12 andrewfg

In the meantime, I think I discovered some bugs.

To be specific, when the core does an mDNS broadcast, the Hue Bridge returns an infinite Time To Live value (-1). Which I think means that once the device has entered into the Inbox, it will NEVER be removed, even if the device is physically disconnected.

So IMHO there are probably two bugs as follows..

  1. There is a bug in the Hue Bridge firmware: It should not return an infinite Time To Live value. Probably this should be reported to Philips.
  2. There is a bug in the OH MDNSDiscoveryService: It should never accept an infinite Time To Live value from a device. Instead it should impose some non infinite time after which it will delete/refresh its cache. So that physically lost devices are finally actually deleted from the Inbox.

Obviously this issue potentially goes beyond the scope of just the Hue binding. => But its probably best to collect comments here first.

andrewfg avatar Dec 26 '21 18:12 andrewfg

^ More..

I just looked at the network traffic using WireShark. In fact the Hue Bridge IS returning valid TTL values (of 120 seconds) in its mDNS response telegrams. => So it seems that the core MDNSDiscoveryService is wrongly parsing the TTL value to INFINITE (-1).

=> I obviously need to dig deeper into this!!

andrewfg avatar Dec 26 '21 18:12 andrewfg

The magic behind mDNS discovery on the local network is done by a third-party library located here: https://github.com/jmdns/jmdns. Without knowing much about the internals of that library I tend to say your observed issue has to be looked up over there. OHC implements the JmDNS ServiceListener and reacts upon the events accordingly.

cweitkamp avatar Dec 27 '21 07:12 cweitkamp

@cweitkamp thanks for the pointer. In fact it seems more complex than that: when the OH mDNS discovery service calls for devices to report in, the Hue actually sends eight different types mDNS answer records (types PTR, TXT, SRV, A, AAAA x2, NSEC x 2). And the HueBridgeMDNSDiscoveryParticipant.getThingUID() method recognises and hits on six of these. So in other words, after one mDNS ping, the Hue binding currently makes six thingDiscovered() call-backs. They all have the same ThingUID so the core only adds one thing to the Inbox. But I suppose that the JmDNS library caches each answer record separately according to its own TTL. And the specific difficulty is that some of the six/eight answer records have a TTL of 120 seconds (which seems reasonable), and some of them have a TTL of 4500 seconds (75 minutes, which seems too long).

So I will try to refine your mDNS code in this PR to 1) to be more selective so that it only hits on just one of the eight types of mDNS answer records, and (therefore) 2) will use the proper TTL value on the cache.

andrewfg avatar Dec 27 '21 13:12 andrewfg

be more selective so that it only hits on just one of the eight types of mDNS answer records

I have been digging through the code, and I think this is impossible; so suggest not changing anything.

use the proper TTL value on the cache.

Nevertheless, I think the JMDNS cache is taking the 4500 second TTL value on the "main" discovery resource record, rather than the 120 second TTL value on the discovery resource record that contains the IP address. This means that a removalGracePeriod would certainly never be needed. But it also means that if the Hue is unplugged or powered off, it will still remain in the Inbox for 75 minutes.

So my proposed next steps are as follows..

  1. I suggest to keep your existing removalGracePeriod code "just in case"; but you could make it so that if the config param is absent the default value is 0.
  2. I suggest add a .withTTL() clause to the result builder instantiation (see below); and I suggest the value 120 seconds since this is exactly the value that the Hue bridge provides in the TTL value it provides on its discovery resource record containing the IPv4 address. (i.e. hard code the value that is actually observed, even though it apparently cannot be read programmatically via calls to the JmDNS bundle). This means that if the bridge is unplugged or powered off, it will disappear from the Inbox in a reasonable time.

image

andrewfg avatar Dec 27 '21 16:12 andrewfg

@andrewfg Thank you very much for the analysis. I incorporated your suggestions from https://github.com/openhab/openhab-addons/pull/11842#issuecomment-1001633159.

cweitkamp avatar Dec 29 '21 10:12 cweitkamp

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/java-connection-to-https-with-self-signed-cert/130593/8

openhab-bot avatar Jan 09 '22 09:01 openhab-bot

Will this fix address the bridge going offline/online within milliseconds of each other?

https://community.openhab.org/t/oh-3-x-bug-things-going-offline-online-within-milliseconds/131356

Best, Jay

jaywiseman1971 avatar Jan 09 '22 14:01 jaywiseman1971

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/hue-hue-v2-api-what-will-be-the-impact-for-the-binding/129910/3

openhab-bot avatar Jan 10 '22 20:01 openhab-bot

@lolodomo Thanks for your review. I incorporated your suggested changes and pushed a new version.

cweitkamp avatar Jan 18 '22 11:01 cweitkamp

@cweitkamp / I will try to finish my last part review very soon. Note that there is now a conflict to solve. Do you still expect more tests ? I could test the new discovery stuff with my old bridge but that's all. Would you like this PR merged and be included in 3.3M1 or do you prefer we wait ?

lolodomo avatar Jan 26 '22 23:01 lolodomo

I'd prefer to see it in milestone 3.3.M1. But in advance we have to merge https://github.com/openhab/openhab-core/pull/2695 too. Maybe @openhab/core-maintainers can take care of that soon.

cweitkamp avatar Jan 27 '22 08:01 cweitkamp

^ I was planning on giving this new version a good testing on my operative system. But I can’t do that until the core has been updated to remove the dependency. So in the meantime (speaking for myself) the new binding version is operationally untested. Except for the discovery issue, that seems to have been resolved.

andrewfg avatar Jan 27 '22 08:01 andrewfg

@cweitkamp : review finished

lolodomo avatar Jan 29 '22 09:01 lolodomo

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/hue-motion-sensor-problem/133033/2

openhab-bot avatar Feb 07 '22 21:02 openhab-bot

@cweitkamp - considering the migration to mDNS in Q2/2022, would it be possible to have a look at the few remaining comments from @lolodomo, so we could have this merged soon? :)

jlaur avatar Apr 26 '22 21:04 jlaur

@cweitkamp : could you please at least comment ? If you have no time, someone can probably finish, starting from your current code.

lolodomo avatar Jun 04 '22 05:06 lolodomo