misterhouse icon indicating copy to clipboard operation
misterhouse copied to clipboard

PLM Acts Strangely After Linking to New I2CS Device

Open JaredF opened this issue 10 years ago • 5 comments

I am filing this for the issue mentioned in the mailing list here: http://misterhouse.10964.n7.nabble.com/Strange-PLM-Error-After-Adding-a-New-I2CS-Device-td19763.html

While no other user has reporting seeing this issue, I am able to reliably reproduce this in my environment so I wanted to open an issue for it, even if it is eventually closed as "wontfix" or "can't reproduce"

I am guessing that this is either an issue with the I2CS linking routine code or the PLM firmware itself. If it is the PLM, it may be possible that only some versions of PLM firmware will show this behavior.

To reproduce:

  1. Create a few internal PLM scenes and sync all device links for these
  2. Add a brand new I2CS device to the network and run a "link to interface" on it
  3. Activate one of the PLM scenes and notice strange behavior from the PLM
    • The PLM will report traffic that should not be generated by activating the scene
    • Some PLM functions will no longer work properly
    • It seems that multigroup devices such as RemoteLincs and KeypadLincs appear more frequently in the 'phantom' traffic that the PLM reports
  4. A power-cycle of the PLM is the only way to exit from this seemingly corrupted state

JaredF avatar Jul 27 '14 08:07 JaredF

Sorry @JaredF, I have still yet to try and reproduce this on my end. I will try and look into it soon.

krkeegan avatar Jul 29 '14 21:07 krkeegan

No worries. It only happens when I link new I2CS devices, so it only affects me from time to time. I just wanted to make a record of it here for later follow-up.

JaredF avatar Jul 29 '14 21:07 JaredF

@JaredF, I just added 4 new devices, moved device names between two keypad switchlincs (which I really expected to fail but thanks to Kevin this succeeded), and added the new devices to 3 different scenes. All succeeded. MH activated the scenes this morning with cron timers and then I tried manually turning one scene on/off with the web interface. I have not noticed any strange behavior yet. It seems like your condition above is reproducible for you so we should be able to reproduce it as well. I realized that some of the descriptions you have are a little ambiguous. It might help to add some more details to 1. i.e. exactly what do you add when you reproduce the problem and then explain exactly how you activate one of the scenes in 2. Finally how do you observe the phantom traffic (maybe it is happening and I don't notice).

mstovenour avatar Nov 23 '14 13:11 mstovenour

Ah. I think I reproduced, by dumb luck, the right conditions to cause some issues with both MH and the PLM while linking a new device. I tried to link a device that is a close to the edge of the signal range so there were a few retransmissions, etc. that might have helped expose the issue but this is just a guess. This is a gist of the entire log: https://gist.github.com/mstovenour/1cbe4c72d6fa591e2ba5

I've pulled out a few interesting events: The log shows:

  • PLM linked to device as controller
  • device aldb read (successful)
  • device status request (successful)
  • device write aldb (fails)
11/24/14 09:04:28 PM [Insteon::AllLinkDatabase] The link table for $accent_bed4 is in sync.
11/24/14 09:04:28 PM [Insteon::AllLinkDatabase] DEBUG4: Found empty address [0fef] in empty array
11/24/14 09:04:28 PM [Insteon::AllLinkDatabase] DEBUG2: adding link record to $accent_bed4 light level controlled by $PLM and group: 01 with on level: 100, ramp rate: 0.1, local load(data3): 01
11/24/14 09:04:28 PM [Insteon::ALDB_i2] $accent_bed4 writing address: 0fef for device: 1edc30 and group: 01
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG2: Sending obj=$accent_bed4; command=read_write_aldb; extra=0000020fef08E2011edc3003000100 incurred delay of 0.05 seconds; starting hop-count: 1
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG3: Sending  PLM raw data: 026226ad33152f0000020fef08e2011edc30030001b8
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG4:
         PLM Command: (0262) insteon_send
              To Address: 26:ad:33
           Message Flags: 15
                Message Type: (000) Direct Message
              Message Length: (1) Extended Length
                   Hops Left: 1
                    Max Hops: 1
         Insteon Message: 2f0000020fef08e2011edc30030001b8
                       Cmd 1: (2f) Read/Write ALL-Link Database
                       Cmd 2: (00) Command in D2
                      D1-D14: 00020fef08e2011edc30030001b8
  • PLM seems to send corrupted send-ACK
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG3: Received PLM raw data: 026226ad33152f0000020fef08e2011edc0001b800d206
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc0001b800d206
11/24/14 09:04:28 PM [Insteon_PLM] DEBUG3: Saving data fragment: 026226ad33152f0000020fef08e2011edc0001b800d206
  • MH stalls because the message doesn't match
11/24/14 09:04:29 PM [Insteon_PLM] DEBUG3: Received PLM raw data: 025026ad331edc30a72ffd
11/24/14 09:04:29 PM [Insteon_PLM] DEBUG3: Prepending prior data fragment: 026226ad33152f0000020fef08e2011edc0001b800d206
11/24/14 09:04:29 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc0001b800d206025026ad331edc30a72ffd
11/24/14 09:04:29 PM [Insteon_PLM] DEBUG3: Saving data fragment: 026226ad33152f0000020fef08e2011edc0001b800d206025026ad331edc30a72ffd
  • finally write aldb fails
11/24/14 09:04:30 PM [Insteon::BaseObject] WARN!! encountered a nack message (bad_checksum_or_unknown_cmd) for $accent_bed4.  It may be unplugged, have a burned out bulb, or this may be a new I2CS type device that must first be manually linked to the PLM using the set button.
11/24/14 09:04:30 PM [Insteon::BaseObject] WARN: Now calling message failure callback: ::print_log("[Insteon::BaseInsteon] Error: The Link_To_Interface routine failed for device: $accent_bed4")
11/24/14 09:04:30 PM [Insteon::BaseInsteon] Error: The Link_To_Interface routine failed for device: Insteon::LampLinc=HASH(0xa56552c)

*Stuck message stays in queue until later when I hit a KPL button unrelated to the device being synced. (there was no other activity between 9:05pm and 11:05pm)

11/24/14 09:04:30 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc30030001b806
11/24/14 09:04:30 PM [Insteon_PLM] DEBUG3: Saving data fragment: 026226ad33152f0000020fef08e2011edc30030001b806
11/24/14 09:04:31 PM [Insteon_PLM] DEBUG3: Received PLM raw data: 025026ad331edc30212f00
11/24/14 09:04:31 PM [Insteon_PLM] DEBUG3: Prepending prior data fragment: 026226ad33152f0000020fef08e2011edc30030001b806
11/24/14 09:04:31 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc30030001b806025026ad331edc30212f00
11/24/14 09:04:31 PM [Insteon_PLM] DEBUG3: Saving data fragment: 026226ad33152f0000020fef08e2011edc30030001b806025026ad331edc30212f00
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: Received PLM raw data: 02502042d3000006cb1300
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: Prepending prior data fragment: 026226ad33152f0000020fef08e2011edc30030001b806025026ad331edc30212f00
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 026226ad33152f0000020fef08e2011edc30030001b806025026ad331edc30212f0002502042d3000006cb1300
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: ERROR: Could not process message. Removing stale data from queue.
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG3: Processing PLM raw data: 025026ad331edc30212f0002502042d3000006cb1300
11/24/14 11:05:32 PM [Insteon_PLM] DEBUG4:
         PLM Command: (0250) insteon_received
            From Address: 26:ad:33
              To Address: 1e:dc:30
           Message Flags: 21
                Message Type: (001) ACK of Direct Message
              Message Length: (0) Standard Length
                   Hops Left: 0
                    Max Hops: 1
         Insteon Message: 2f00
                       Cmd 1: (2f) Light OFF at Ramp Rate
                       Cmd 2: (00) Ramp Rate

11/24/14 11:05:32 PM [Insteon::BaseInterface] DEBUG3: PLM command:insteon_received; Device command:read_write_aldb; type:direct; group:
11/24/14 11:05:32 PM [Insteon::BaseInterface] WARN: received insteon ACK/NACK message from $accent_bed4 but cannot correlate to sent message! IGNORING received message!!

mstovenour avatar Nov 25 '14 14:11 mstovenour

Ahh more Insteon fun.

@mstovenour , I agree with your assessment of your logs. I am not sure if there is anything else MH could do. Best I can tell, MH acts reasonably when it receives the bad data.

Thissounds slightly different than what @JaredF is seeing, but they could be related. @JaredF since you can repeat your error at will, can you create a detailed error log for us?

krkeegan avatar Dec 30 '14 00:12 krkeegan