architecture icon indicating copy to clipboard operation
architecture copied to clipboard

Unique IDs for entities of integrations without a physical unique ID

Open OnFreund opened this issue 4 years ago • 32 comments

Context

Recently @balloob and I discussed how to create unique IDs (for the purpose of adding to the entity registry) for entities where the integration has no real unique ID. The classic use case for this is a hub-like device that creates several entities, but can have more than one instance.

Proposal

The proposal is to give the user control over the unique ID, by allowing them to set an entity namespace, which will serve as a prefix for entities created by that instance of the integration. This has several benefits:

  • There's a difference between the "physical" unique ID, and the "logical" one. In many cases the user actually cares more about the latter, and this aligns with this preference (e.g. when changing the physical device but keeping everything connected in the same way, the user in many cases will perceive this to be the same unique instance, and vice versa - keeping the same device but connecting in a different way might require a new namespace).
  • Service calls for that integration can potentially accept an entity_namespace parameter, which, if passed, will be used instead of/in addition to entity_id, to target all the entities created an instance.

In its most basic form, this is simply a pattern that does not require any changes to core. However, if we support this in core, it opens up a few other benefits:

  • A cleaner migration path - the entity registry can implement a mechanism to change the namespace of an existing entity, thus allowing easy migration of both "legacy" integrations, and cases where the user decides to change the namespace.
  • Core support can convert the entity_namespace parameter in service calls into a list of entity ids, thus allowing integrations to support this parameter without changes to the service handling code.

An example implementation of the pattern (without core support) is the current implementation in https://github.com/home-assistant/home-assistant/pull/30337. The namespace differentiates between unique instances of a Monoprice multi-zone amplifier.

Consequences

Assuming no core changes, this has no consequences at all on integrations that don't implement the pattern. Even with core support, there shouldn't be any effect. For the integrations that do implement the pattern, there's a breaking change to introduce the required namespace parameter (unless we implement support in core).

OnFreund avatar Jan 07 '20 07:01 OnFreund

What about non physical related identities such as the utility_meter ? Would it also fit into this proposal ?

dgomes avatar Jan 07 '20 08:01 dgomes

@dgomes potentially yes, but I'm not sure what would be the benefit for utility_meter - I haven't used it, but from what I understand the user is in complete control of the entities it creates.

OnFreund avatar Jan 07 '20 08:01 OnFreund

Something along these lines: https://github.com/home-assistant/home-assistant/issues/22970

dgomes avatar Jan 07 '20 09:01 dgomes

Also assigning it to areas which you can't do without unique id. Just for reference an alternative/additional approach is: https://github.com/home-assistant/home-assistant/pull/27207

elupus avatar Jan 07 '20 09:01 elupus

@dgomes I think that adding a name parameter to the configuration might be a better route for that.

@elupus for areas you're also going to need a device, which requires a config flow, so an entity id is not enough. Thanks for sharing the PR - really interesting. I think this comment from it by @balloob hints at the solution I'm proposing here:

I much rather explore changing rules so that integrations can try to generate unique IDs than that we try to have Home Assistant generating unique IDs.

OnFreund avatar Jan 07 '20 09:01 OnFreund

So this proposal needs to be tailored down a lot.

  • If a device has a unique ID, we should use that.
  • If a device has an IP address, we should use the mac address as unique ID.
  • Only if no unique ID is available at any way, can we make it user configurable. It will be up to the integration to include this in their configuration schema.
  • If an entity/integration is not connected to external devices and services but only uses Home Assistant data, they should use the collection helper and allow people to configure IDs.

balloob avatar Jan 07 '20 13:01 balloob

I just object to the usage of mac from IP, since it will not be possible for users that want their iot devices on separate wlan/subnet. It's fine if it's used by default, but some way of overiding/manual config would be good. (Could be provided as a manual config for the getmac helper)

elupus avatar Jan 07 '20 13:01 elupus

I think these are great rules of thumb, but I can imagine cases where I'd prefer a logical unique ID over a physical one. For example, if the device is relatively cheap and/or has a limited life span, and is replaced often. I think it's best left to the integration developer, but with the strong recommendation to use the physical device ID if one exists.

OnFreund avatar Jan 07 '20 15:01 OnFreund

I don't think we need to solve all edge cases. VLAN is probably that.

I fear that if we don't have strict rules that developers will take the easy way out and add a config option, shifting the burden to the user and adding code maintenance of another option.

MartinHjelmare avatar Jan 07 '20 16:01 MartinHjelmare

Guest WLAN is also ruled out, as well as VPN links to Sommer home.

elupus avatar Jan 07 '20 16:01 elupus

Another alternative is that if there is no unique ID, we require integrations to import the config into a config entry so that the config entry ID can be used.

balloob avatar Jan 07 '20 16:01 balloob

That's similar to what I suggested as option 4 in the Monoprice PR. I think it's a great solution, but it misses some of the benefits of the proposed one (e.g. being able to target the entities of a specific integration). I also think that with the current state, the bar to a config entry is sometimes high, especially when lots of configuration is needed.

OnFreund avatar Jan 07 '20 17:01 OnFreund

So what is the best way forward here?

OnFreund avatar Jan 13 '20 21:01 OnFreund

If no unique ID available, make it use the config entry id.

balloob avatar Jan 14 '20 06:01 balloob

But that's a very high barrier. For example, looking at Monoprice, my planning was:

  1. Add a unique ID
  2. Wait one version so that everyone gets a chance for their entities to be added to the registry
  3. Transition the configuration to a config flow (a much larger undertaking, and in this case there's additional complexity - until the entities are in the registry, it's very hard to migrate to a config flow since the configuration mostly consists of entities and their names. I'm also yet unsure how to handle frequently changing configuration, such as source names).
  4. Release the config flow version, and since the entities are already in the registry, migration should be relatively seamless.

Is there a way to get a config entry id without a config flow?

OnFreund avatar Jan 14 '20 06:01 OnFreund

It's not possible to get a config entry ID without a config flow. It is possible to have a config flow without a UI and purely an import from configuration.yaml step that is triggerd inside async_setup.

Now with our scaffolding it's not actually very hard to add a config flow. Just run python3 -m script.scaffold config_flow

balloob avatar Jan 14 '20 22:01 balloob

Thanks. Is there a good example of an integration with a config flow without a UI and just an import from the yaml?

OnFreund avatar Jan 15 '20 08:01 OnFreund

Not sure it's good and it's not using the scaffolding, but one example is arcam fmj. (At least until the updated flow is added)

elupus avatar Jan 15 '20 08:01 elupus

@balloob looking at the arcam fmj example noted above, it seems like in order to import from yaml, there's a need to check if an entry already exists when importing, which implies some unique identification and we're back to square one...

OnFreund avatar Jan 15 '20 09:01 OnFreund

You can assume only one import exist. Or use some differencing data in config if there is such even if it's not unique.

I do forsee some problems with config entry unique id thou since entities names will change on a reconfiguration (removal and re-add of entry)

elupus avatar Jan 15 '20 13:01 elupus

@elupus this issue originated in order to avoid supporting only one instance, and with no clear differentiator.

I completely agree with the second point, which is why I still think a namespace as originally proposed is a better solution than a config entry id.

OnFreund avatar Jan 15 '20 13:01 OnFreund

@balloob looks like we're at a deadlock here - how do you suggest we proceed?

OnFreund avatar Jan 19 '20 06:01 OnFreund

I'll add this to the list of architecture issues that require a decision. I am planning on scheduling a call with the core team soon to hash a couple of these out.

balloob avatar Jan 19 '20 06:01 balloob

I'll add my two cents since the ElkM1 integration had this problem. Further I'm working on UPB integration which also has the same problem. There's just no unique id that either of these devices have.

The ElkM1 can be connected through either serial port or over TCP. Same with UPB.

@balloob you make a suggestion above to use MAC if device has an IP. I'm OK with that if the hass platform can provide reliable services to do that. In my experience there are too many cases that you cannot (through software) get the MAC for an IP. Further solutions such as looking in the ARP cache are non-portable and only work when everything is on the same network, which more and more people are implementing multiple subnets.

In the case of ElkM1, the platform does not provide any way to get the MAC. In addition support for both serial and MAC means that the ElkM1 may or may not have a MAC.

@MartinHjelmare you say "I fear if we don't have rules for this developers will take the easy way out". I believe guidelines (not rules) are absolutely required and get enforced with code review. I've had my code reviewed by you and I'm happy that level of detail and questioning are applied to them.

So, I support an architectural decision on this. My preference is something that is simple to implement. The namespace solution would be easy for developers to implement. I won't go so far to say "I recommend that" because there are other possible solutions, discussed in this thread, that may also be easy to implement and fit the architectural goals even better.

Having been involved with leading architecture teams during many years of my career I understand the difficulty of picking when "easy" conflicts with "right". I hope the right balance can be struck here.

gwww avatar Jan 30 '20 03:01 gwww

After thinking more about it, I think that using the config entry ID is a good enough solution. The concern by @elupus that entity customization will be lost is exactly how current implementation works. When you remove a config entry, all related devices and entities will be removed from their registries.

I don't believe that adding a config entry is a high barrier. python3 -m script.scaffold config_flow and you're pretty much there.

balloob avatar Jan 30 '20 06:01 balloob

@balloob I'm not sure what's the definition of "good enough", or how valid concerns are so easily dismissed, but, at least for Monoprice, which was the impetus for this arch issue, not only is it not good enough, it's not a solution at all.

  1. A config entry is a high barrier - just look at how long it takes to approve PRs with config entries, and how some contributors are reluctant to get into it. While the scaffold is great, configuration is not as easy as in YAML (for example, validation). This isn't necessarily a bad thing, but the very fact cannot be denied.
  2. The UI is limited, and complex configuration (such as Monoprice) is extremely hard to do.
  3. Using a config entry with a YAML configuration could have been a good step in between, but it requires a unique ID, so back to square 1.
  4. The Monoprice configuration mostly consists of items that mimic some of the behavior of the entity registry. The whole point of my PR was to leverage the entity registry, simplify the configuration, and then move to a config entry when the configuration is simple and can easily be ported over to UI.
  5. If you remove a config entry, and re-add it with identical IDs, I believe customizations will still work (as well as automations, UI, etc...)
  6. The fact that removing a config entry and re-adding it is not an easy migration, does not justify a perpetuation of this behavior. My belief is that we should either make it easier on the end user, or reduce the cases that require it.

In any case, progress on Monoprice is now completely blocked.

OnFreund avatar Jan 30 '20 07:01 OnFreund

Just discovered that there's at least one integration that's already implementing the "namespace" pattern, albeit via a different name - the MQTT alarm platform uses a unique_id parameter to differentiate between different alarms. Not exactly the same, since there's only one entity (if I understand correctly), but the same spirit of letting the user be explicit about differentiating between multiple instances.

OnFreund avatar Jan 31 '20 14:01 OnFreund

I don't feel like continuing this discussion as now all of a sudden the lack of people helping out with PRs is brought up as a reason not to accept the easiest solution 🤷‍♂

balloob avatar Feb 01 '20 08:02 balloob

the lack of people helping out with PRs is brought up as a reason not to accept the easiest solution

That is not what I said. What I did say is that you might be misjudging how easy it really is.

In any case, that's also not my main argument against it - the main one is that it doesn't solve the problem for the case that started this discussion. I agree with @gwww that guidelines are better than rules, but if we are looking to adopt rules, I believe we should be adopting ones that don't prevent us from evolving existing integrations.

I might be missing something here, but how do I propose migrating Monoprice to use config entries without a unique id?

OnFreund avatar Feb 01 '20 08:02 OnFreund

I understand there's no desire to continue the conversation, just want to make sure we explicitly document the consequences - the decision to use the entry id as a unique id means that integrations without a physical unique id that want their entities in the entity registry, will not only have to have a config flow, but will also have to transition to UI configuration, i.e. a breaking change.

As for the specific Monoprice case, I think the only option left is to turn this into a larger change, and take advantage of the fact it's a breaking one to simplify the configuration. The downside, other than it being a breaking change, is that it's going to be a big monolithic PR. My initial plan was a series of smaller changes that are mostly transparent.

OnFreund avatar Feb 04 '20 17:02 OnFreund