room-assistant
Incorrect leader elected
Describe the bug
I have a cluster with two nodes, and the wrong node gets elected as leader. The main node is livingroom, with weight 100. I also have a second node, bedroom, that should not be the leader. However, as the logs below show, bedroom got elected as the leader.
Relevant logs
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] done.
[services.d] starting services
[services.d] done.
[20:29:44] INFO: Setting up Home Assistant configuration
[20:29:44] INFO: Starting room-assistant
*** WARNING *** The program 'node' uses the Apple Bonjour compatibility layer of Avahi.
*** WARNING *** Please fix your application to use the native API of Avahi!
*** WARNING *** For more information see <http://0pointer.de/blog/projects/avahi-compat.html>
*** WARNING *** The program 'node' called 'DNSServiceRegister()' which is not supported (or only supported partially) in the Apple Bonjour compatibility layer of Avahi.
*** WARNING *** Please fix your application to use the native API of Avahi!
*** WARNING *** For more information see <http://0pointer.de/blog/projects/avahi-compat.html>
5/11/2020, 8:29:44 PM - info - IntegrationsModule: Loading integrations: home-assistant, bluetooth-classic
5/11/2020, 8:29:44 PM - info - NestFactory: Starting Nest application...
5/11/2020, 8:29:44 PM - info - InstanceLoader: AppModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: ConfigModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: NestEmitterModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: IntegrationsModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: DiscoveryModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: HomeAssistantModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: ClusterModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: ScheduleModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: BluetoothClassicModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: EntitiesModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: StatusModule dependencies initialized
5/11/2020, 8:29:44 PM - info - RoutesResolver: EntitiesController {/entities}:
5/11/2020, 8:29:44 PM - info - RouterExplorer: Mapped {/, GET} route
5/11/2020, 8:29:44 PM - info - RoutesResolver: StatusController {/status}:
5/11/2020, 8:29:44 PM - info - RouterExplorer: Mapped {/, GET} route
5/11/2020, 8:29:44 PM - info - HomeAssistantService: Successfully connected to MQTT broker at mqtt://core-mosquitto:1883
5/11/2020, 8:29:44 PM - info - ConfigService: Loading configuration from /usr/lib/node_modules/room-assistant/dist/config/definitions/default.js, config/default.json, config/local.json
5/11/2020, 8:29:44 PM - info - ClusterService: Starting mDNS advertisements and discovery
5/11/2020, 8:29:44 PM - info - NestApplication: Nest application successfully started
5/11/2020, 8:29:45 PM - info - ClusterService: Added 192.168.0.???:6425 to the cluster with id bedroom
5/11/2020, 8:29:46 PM - info - EntitiesService: Refreshing entity states
5/11/2020, 8:30:45 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/11/2020, 8:30:45 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/11/2020, 8:32:45 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/11/2020, 8:36:50 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/11/2020, 8:51:13 PM - info - ClusterService: bedroom has been elected as leader
5/11/2020, 9:34:45 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/12/2020, 1:35:30 AM - info - ClusterService: bedroom has been elected as leader
5/12/2020, 2:28:01 AM - info - ClusterService: Removed 192.168.0.???:6425 from the cluster with id bedroom
5/12/2020, 2:28:04 AM - info - ClusterService: Added 192.168.0.???:6425 to the cluster with id bedroom
5/12/2020, 3:59:41 AM - info - ClusterService: bedroom has been elected as leader
5/12/2020, 4:06:40 AM - info - ClusterService: bedroom has been elected as leader
5/12/2020, 9:04:09 AM - info - ClusterService: bedroom has been elected as leader
5/12/2020, 9:39:30 AM - info - ClusterService: bedroom has been elected as leader
Relevant configuration
Paste the relevant parts of your configuration below.
living room
global:
  instanceName: livingroom
  integrations:
    - homeAssistant
    - bluetoothClassic
cluster:
  weight: 100
bluetoothClassic:
  interval: 60
  addresses:
    ??
    ??
    ??
    ??
    ??
bedroom
global:
  instanceName: bedroom
  integrations:
    - homeAssistant
    - bluetoothClassic
cluster:
  quorum: 2
  weight: 1
homeAssistant:
  mqttUrl: 'mqtt://??????:1883'
  mqttOptions:
    username: mqtt
    password:?????????
bluetoothClassic:
  interval: 60
  addresses:
    ??
    ??
    ??
    ??
    ??
Expected behavior
I would expect the livingroom node to be the leader.
Environment
- room-assistant version: 2.6.0
- installation type: Hass.io in livingroom, and RPI in bedroom.
- hardware: Docker and RPI 3
- OS: Ubuntu Server and Raspbian
The weights for the leader election are more guidelines than hard rules. When connecting instances together the leader is chosen by the following logic:
- instance boots, no other nodes to connect to -> elects itself as leader after a short timeout
- cluster with leader exists, instance boots, connects without having elected itself yet -> accepts whatever leader is set in the cluster
- cluster with leader exists, instance with different leader already set connects -> new election is held
During an election each instance just submits a vote for the node that has the highest weight among the ones it knows of locally. Applying this to your scenario, I suspect that your bedroom node was already running and had elected itself as leader by the time livingroom connected. As a quick fix you can try shutting down both instances and then starting livingroom first. Once that's done you can start bedroom. Both should now have livingroom as the leader.
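The three join rules and the voting step above can be sketched roughly as follows. This is a simplified model for illustration only, not room-assistant's actual implementation: the `NodeInfo` type and the `vote` and `leaderAfterJoin` functions are hypothetical names, and the sketch assumes every node sees all other nodes during an election.

```typescript
// Hypothetical model of the join rules described above (not room-assistant code).
interface NodeInfo {
  id: string;
  weight: number;
  leader: string | null; // the leader this node currently believes in, if any
}

// Each instance votes for the highest-weight node it knows of locally.
// Assumes knownNodes is non-empty.
function vote(knownNodes: NodeInfo[]): string {
  return knownNodes.reduce((best, n) => (n.weight > best.weight ? n : best)).id;
}

// Returns the leader after `joining` connects to a cluster whose current
// leader is `clusterLeader` (null means there is no cluster to connect to).
function leaderAfterJoin(
  clusterLeader: string | null,
  joining: NodeInfo,
  allNodes: NodeInfo[],
): string {
  if (clusterLeader === null) {
    // Rule 1: no other nodes -> the instance elects itself after a short timeout.
    return joining.id;
  }
  if (joining.leader === null || joining.leader === clusterLeader) {
    // Rule 2: the instance has not elected itself yet -> accept the existing leader.
    return clusterLeader;
  }
  // Rule 3: the instance arrives with a different leader -> hold a new election.
  return vote(allNodes);
}

const bedroom: NodeInfo = { id: "bedroom", weight: 1, leader: null };
const livingroom: NodeInfo = { id: "livingroom", weight: 100, leader: null };

// bedroom boots alone and elects itself.
bedroom.leader = leaderAfterJoin(null, bedroom, [bedroom]);

// livingroom connects before electing itself and accepts bedroom, despite its weight.
const reported = leaderAfterJoin(bedroom.leader, livingroom, [bedroom, livingroom]);

// If livingroom had already elected itself, a new election would be held instead.
const conflicting: NodeInfo = { ...livingroom, leader: "livingroom" };
const afterElection = leaderAfterJoin(bedroom.leader, conflicting, [bedroom, conflicting]);
```

Under these assumptions, `reported` ends up as bedroom, which matches the behavior in the logs, while `afterElection` ends up as livingroom because the weights only matter once an election actually takes place.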
Ahh, thanks for the explanation. The issue for me is that the raspberry pi doesn't have the best connection (ssh is really slow) so I suspect that I'm not getting updates when it's the leader. I can force livingroom to be the leader by restarting the pi but my livingroom server also restarts from time to time (updates) so it would be nice if it didn't lose its leadership position because of that.
Would it make sense to change the protocol and elect a leader every time a node joins a cluster?
The issue with that would likely be random state changes, as an instance starts with an empty state. Once an instance is elected as leader it will force the entities to match its own local state - if the state hasn't regenerated to the right level yet on the instance you will see random blips of wrong states with the restarts.
Aside from that, if your instance reconnects within cluster.timeout the cluster should not change leaders.
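As a rough illustration of that timeout rule, the check below treats a silent peer as still part of the cluster until the timeout has elapsed. The function name `isStillMember` is hypothetical, and the sketch assumes cluster.timeout is expressed in seconds; the real check in room-assistant may look different.

```typescript
// Hypothetical helper: a peer that was last seen within the timeout window is
// still considered a cluster member, so its leadership would be left untouched.
// Assumption: the timeout value is given in seconds.
function isStillMember(
  lastSeenMs: number,
  nowMs: number,
  timeoutSeconds: number,
): boolean {
  return nowMs - lastSeenMs <= timeoutSeconds * 1000;
}

// A leader that restarts and reconnects after 3 seconds with a 60 second timeout.
const keptMembership = isStillMember(0, 3_000, 60);

// A leader that stays away for 61 seconds would be dropped from the cluster.
const droppedMembership = isStillMember(0, 61_000, 60);
```

With a quick restart the first case applies and no election is triggered; a longer outage falls into the second case, after which the join rules decide who leads once the node comes back.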
Couldn't there be some initialization protocol where a new node that becomes the new leader initializes its state before taking over as the leader?
The issue in my case is that the bedroom node has bad wifi and so I don't want it to be the leader.
I understand that my use case is maybe not the target use case, so feel free to close this issue as wontfix, but maybe my feedback is useful for future versions of room-assistant.
There probably could be - and at the very least we should handle these kinda scenarios better. I'll keep this open for tracking. Maybe I can think of a good solution!
There hasn't been any activity on this issue recently. In an effort to provide a better overview of current issues we automatically clean some of the old ones. Many of them may already be resolved in newer versions of room-assistant. This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
Please don't close this issue.
I believe the enableStrictWeightMode option introduced in https://github.com/goldfire/democracy.js/pull/18 could resolve this issue.