misterhouse
misterhouse copied to clipboard
PLM Acts Strangely After Linking to New I2CS Device
I am filing this for the issue mentioned in the mailing list here: http://misterhouse.10964.n7.nabble.com/Strange-PLM-Error-After-Adding-a-New-I2CS-Device-td19763.html
While no other user has reporting seeing this issue, I am able to reliably reproduce this in my environment so I wanted to open an issue for it, even if it is eventually closed as "wontfix" or "can't reproduce"
I am guessing that this is either an issue with the I2CS linking routine code or the PLM firmware itself. If it is the PLM, it may be possible that only some versions of PLM firmware will show this behavior.
To reproduce:
- Create a few internal PLM scenes and sync all device links for these
- Add a brand new I2CS device to the network and run a "link to interface" on it
- Activate one of the PLM scenes and notice strange behavior from the PLM
- The PLM will report traffic that should not be generated by activating the scene
- Some PLM functions will no longer work properly
- It seems that multigroup devices such as RemoteLincs and KeypadLincs appear more frequently in the 'phantom' traffic that the PLM reports
- A power-cycle of the PLM is the only way to exit from this seemingly corrupted state
Sorry @JaredF, I have still yet to try and reproduce this on my end. I will try and look into it soon.
No worries. It only happens when I link new I2CS devices, so it only affects me from time to time. I just wanted to make a record of it here for later follow-up.
@JaredF, I just added 4 new devices, moved device names between two keypad switchlincs (which I really expected to fail but thanks to Kevin this succeeded), and added the new devices to 3 different scenes. All succeeded. MH activated the scenes this morning with cron timers and then I tried manually turning one scene on/off with the web interface. I have not noticed any strange behavior yet. It seems like your condition above is reproducible for you so we should be able to reproduce it as well. I realized that some of the descriptions you have are a little ambiguous. It might help to add some more details to 1. i.e. exactly what do you add when you reproduce the problem and then explain exactly how you activate one of the scenes in 2. Finally how do you observe the phantom traffic (maybe it is happening and I don't notice).
Ah. I think I reproduced, by dumb luck, the right conditions to cause some issues with both MH and the PLM while linking a new device. I tried to link a device that is a close to the edge of the signal range so there were a few retransmissions, etc. that might have helped expose the issue but this is just a guess. This is a gist of the entire log: https://gist.github.com/mstovenour/1cbe4c72d6fa591e2ba5
I've pulled out a few interesting events: The log shows:
- PLM linked to device as controller
- device aldb read (successful)
- device status request (successful)
- device write aldb (fails)
11/24/14 09:04:28 PM [Insteon::AllLinkDatabase] The link table for $accent_bed4 is in sync.
11/24/14 09:04:28 PM [Insteon::AllLinkDatabase] DEBUG4: Found empty address [0fef] in empty array
11/24/14 09:04:28 PM [Insteon::AllLinkDatabase] DEBUG2: adding link record to $accent_bed4 light level controlled by $PLM and group: 01 with on level: 100, ramp rate: 0.1, local load(data3): 01
11/24/14 09:04:28 PM [Insteon::ALDB_i2] $accent_bed4 writing address: 0fef for device: 1edc30 and group: 01
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG2: Sending obj=$accent_bed4; command=read_write_aldb; extra=0000020fef08E2011edc3003000100 incurred delay of 0.05 seconds; starting hop-count: 1
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG3: Sending PLM raw data: 026226ad33152f0000020fef08e2011edc30030001b8
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG4:
PLM Command: (0262) insteon_send
To Address: 26:ad:33
Message Flags: 15
Message Type: (000) Direct Message
Message Length: (1) Extended Length
Hops Left: 1
Max Hops: 1
Insteon Message: 2f0000020fef08e2011edc30030001b8
Cmd 1: (2f) Read/Write ALL-Link Database
Cmd 2: (00) Command in D2
D1-D14: 00020fef08e2011edc30030001b8
- PLM seems to send corrupted send-ACK
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG3: Received PLM raw data: 026226ad33152f0000020fef08e2011edc0001b800d206
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc0001b800d206
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG3: Saving data fragment: 026226ad33152f0000020fef08e2011edc0001b800d206
- MH stalls because the message doesn't match
11/24/14 09:04:29 PM [Insteon_PLM] DEBUG3: Received PLM raw data: 025026ad331edc30a72ffd
11/24/14 09:04:29 PM [Insteon_PLM] DEBUG3: Prepending prior data fragment: 026226ad33152f0000020fef08e2011edc0001b800d206
11/24/14 09:04:29 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc0001b800d206025026ad331edc30a72ffd
11/24/14 09:04:29 PM [Insteon_PLM] DEBUG3: Saving data fragment: 026226ad33152f0000020fef08e2011edc0001b800d206025026ad331edc30a72ffd
- finally write aldb fails
11/24/14 09:04:30 PM [Insteon::BaseObject] WARN!! encountered a nack message (bad_checksum_or_unknown_cmd) for $accent_bed4. It may be unplugged, have a burned out bulb, or this may be a new I2CS type device that must first be manually linked to the PLM using the set button.
11/24/14 09:04:30 PM [Insteon::BaseObject] WARN: Now calling message failure callback: ::print_log("[Insteon::BaseInsteon] Error: The Link_To_Interface routine failed for device: $accent_bed4")
11/24/14 09:04:30 PM [Insteon::BaseInsteon] Error: The Link_To_Interface routine failed for device: Insteon::LampLinc=HASH(0xa56552c)
*Stuck message stays in queue until later when I hit a KPL button unrelated to the device being synced. (there was no other activity between 9:05pm and 11:05pm)
11/24/14 09:04:30 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc30030001b806
11/24/14 09:04:30 PM [Insteon_PLM] DEBUG3: Saving data fragment: 026226ad33152f0000020fef08e2011edc30030001b806
11/24/14 09:04:31 PM [Insteon_PLM] DEBUG3: Received PLM raw data: 025026ad331edc30212f00
11/24/14 09:04:31 PM [Insteon_PLM] DEBUG3: Prepending prior data fragment: 026226ad33152f0000020fef08e2011edc30030001b806
11/24/14 09:04:31 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc30030001b806025026ad331edc30212f00
11/24/14 09:04:31 PM [Insteon_PLM] DEBUG3: Saving data fragment: 026226ad33152f0000020fef08e2011edc30030001b806025026ad331edc30212f00
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: Received PLM raw data: 02502042d3000006cb1300
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: Prepending prior data fragment: 026226ad33152f0000020fef08e2011edc30030001b806025026ad331edc30212f00
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc30030001b806025026ad331edc30212f0002502042d3000006cb1300
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: ERROR: Could not process message. Removing stale data from queue.
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 025026ad331edc30212f0002502042d3000006cb1300
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG4:
PLM Command: (0250) insteon_received
From Address: 26:ad:33
To Address: 1e:dc:30
Message Flags: 21
Message Type: (001) ACK of Direct Message
Message Length: (0) Standard Length
Hops Left: 0
Max Hops: 1
Insteon Message: 2f00
Cmd 1: (2f) Light OFF at Ramp Rate
Cmd 2: (00) Ramp Rate
11/24/14 11:05:32 PM [Insteon::BaseInterface] DEBUG3: PLM command:insteon_received; Device command:read_write_aldb; type:direct; group:
11/24/14 11:05:32 PM [Insteon::BaseInterface] WARN: received insteon ACK/NACK message from $accent_bed4 but cannot correlate to sent message! IGNORING received message!!
Ahh more Insteon fun.
@mstovenour , I agree with your assessment of your logs. I am not sure if there is anything else MH could do. Best I can tell, MH acts reasonably when it receives the bad data.
Thissounds slightly different than what @JaredF is seeing, but they could be related. @JaredF since you can repeat your error at will, can you create a detailed error log for us?