webpa-common
Fixed a bug where a duplicated device would remove both itself and the newly added device
Fixes #438
When a new connection request arrives from a device whose id already exists, the existing device is replaced by the new one. The old device then calls requestClose, which removes the new device.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 86.76%. Comparing base (a05ca6d) to head (e42635b). Report is 43 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #439 +/- ##
==========================================
- Coverage 86.80% 86.76% -0.04%
==========================================
Files 187 187
Lines 8201 8207 +6
==========================================
+ Hits 7119 7121 +2
- Misses 881 885 +4
Partials 201 201
:umbrella: View full report in Codecov by Sentry.
It looks like if a duplicate connected device is found, we leave the previous connection open until it naturally closes due to inactivity (in the case where a device reconnects after a non-graceful socket shutdown), or we allow multiple connected devices with the same identifier on each talaria.
This means that at any given moment, instead of having a max of 1 duplicate of a CPE per datacenter instance, we could have n, if I am understanding the code right. @johnabass can you confirm I understand this right?
The problem detected is that in registry.go, `newDevice` replaces `existing` and then calls `existing.requestClose`:
```go
r.data[id] = newDevice
r.count.Set(float64(len(r.data)))
r.lock.Unlock()

if existing != nil {
	r.disconnect.Add(1.0)
	r.duplicates.Inc()
	newDevice.Statistics().AddDuplications(existing.Statistics().Duplications() + 1)
	existing.requestClose(CloseReason{Text: "duplicate"})
}
```
In `pumpClose` (manager.go), the device is removed from the map by its id, which removes `newDevice` from the map and closes both connections:
```go
func (m *manager) pumpClose(d *device, c io.Closer, reason CloseReason) {
	// remove will invoke requestClose()
	m.devices.remove(d.id, reason)
	...
}
```
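The interaction between the two snippets above can be reproduced with a minimal, self-contained sketch. The names here (`registry`, `add`, `requestClose`) are simplified stand-ins for the real webpa-common types, not the actual API: the new device is inserted under the id first, and the old device's close path then removes *by id*, so it deletes whatever currently occupies that key, which is the new device.

```go
package main

import "fmt"

// device is a simplified stand-in for the real *device type.
type device struct {
	id     string
	closed bool
}

// registry mimics the id-keyed map in registry.go.
type registry struct {
	data map[string]*device
}

// add replaces any existing device with the same id and then asks
// the old one to close, mirroring the order in the original code.
func (r *registry) add(d *device) {
	existing := r.data[d.id]
	r.data[d.id] = d // new device inserted first ...
	if existing != nil {
		r.requestClose(existing) // ... then the old one is closed
	}
}

// requestClose models the pumpClose path: it removes *by id*, so it
// evicts whatever device currently occupies that key -- the new one.
func (r *registry) requestClose(d *device) {
	if victim, ok := r.data[d.id]; ok {
		delete(r.data, d.id)
		victim.closed = true
	}
	d.closed = true
}

func main() {
	r := &registry{data: map[string]*device{}}

	old := &device{id: "mac:112233445566"}
	r.add(old)

	newer := &device{id: "mac:112233445566"}
	r.add(newer) // duplicate connection triggers the bug

	// Both devices end up closed and the map is empty.
	fmt.Println(len(r.data), old.closed, newer.closed)
}
```

Running this prints `0 true true`: the registry loses the new connection as well as the old one, matching the behavior described in the report.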
> It looks like if a duplicate connected device is found, we leave the previous connection open until it naturally closes due to inactivity (in the case where a device reconnects after a non-graceful socket shutdown), or we allow multiple connected devices with the same identifier on each talaria.
It's not exactly leaving the previous connection open. The problem is that the "new" device with the same id is not allowed to connect on its first try (the talaria closes both connections, the new and the old one). Subsequent retries work correctly, because the bug is not triggered again: both previous connections were already closed, so there are no duplicates.
@diogoViana95 described above that the bug is related to adding `r.data[id] = newDevice` to the dictionary of devices before removing the supposed oldDevice. Since the removal is done afterwards by looking up the id in that same dictionary, `pumpClose` targets the newDevice instead of the oldDevice (the oldDevice is no longer in the dictionary of devices at that point).
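One way to make this ordering safe is an identity-aware removal: delete the map entry only if it still points at the exact device being closed. This is a sketch of the idea, not necessarily what the PR implements, and the names (`registry`, `removeIfSame`) are illustrative rather than the webpa-common API.

```go
package main

import "fmt"

type device struct{ id string }

type registry struct{ data map[string]*device }

// removeIfSame deletes the entry for d.id only when the map still
// points at this exact device, so closing a replaced old device
// cannot evict its replacement.
func (r *registry) removeIfSame(d *device) bool {
	if current, ok := r.data[d.id]; ok && current == d {
		delete(r.data, d.id)
		return true
	}
	return false
}

func main() {
	r := &registry{data: map[string]*device{}}

	old := &device{id: "mac:112233445566"}
	r.data[old.id] = old

	newer := &device{id: "mac:112233445566"}
	r.data[newer.id] = newer // duplicate replaces old in the map

	// Closing the stale old device no longer removes the new one.
	fmt.Println(r.removeIfSame(old), len(r.data))
}
```

This prints `false 1`: the stale removal is a no-op and the new device stays registered, regardless of whether the insert or the close runs first.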