lockwise-android Handle SyncTelemetryPing in DataStore

Requires #865

#865 updated AS and A-C to the latest versions with all required Megazord changes. The sync call in DataStore.kt now returns a SyncTelemetryPing. This should be returned in the case of a successful completion.

Todo

[ ] open question: in which cases will we receive the SyncTelemetryPing? Only on a successful sync?
[ ] send SyncTelemetryPing as value or extra in all applicable telemetry events

Aug 13 '19 13:08 eliserichards

@irrationalagent

a) do we still need this? especially considering #862— TL;DR: we found the missing sync_start events!— and that we're not able to add a sync id to the sync start, only successful sync_ends. b) do we need one per SyncTelemetryPing? i.e. uid; or the one per SyncInfo? c) where should this go in the event ping? as a value or an extra?

Sep 09 '19 18:09 jhugman

Sorry for the late response on this. I'm a bit out of the loop here, so bear with me.

it looks like you have the whole sync ping payload accessible here? if that's the case its going to be a pretty gnarly bit of json. I'm reasonably certain it will be too long to fit into the extra field of an event, and I'm not sure we would want to do that anyway (it def wont fit in the value field).

So, if you found the sync start events that would be a good start towards helping us understand frequency of syncing and its success rate, and depending on a couple things that might be all we need here. a few questions about that:

should there be at most one one sync start event per sync_end/sync_timeout/sync_error event
conversely: for every sync start, will there always be at least one corresponding sync_end/sync_timeout/sync_error event? if not, could we know with some certainty what a missing "sync end" event would mean?
would sync_end fire only on success, or would it fire on failure/error as well? e.g. would we expect a sync error event to generate both sync_end and sync_error, or just the latter?

Without a sync_id we would need to be sure that the above conditions hold in order for the sync events to be useful in analysis. What I'm getting at - we need to ensure that we (somewhat) reliably have a one-to-one mapping of events indicating sync is starting to events indicating its done (whether successfully or not). If we had a sync_id, meeting these requirements would be less important.

IF we can ensure that ^ then I think we will at least have enough data to understand high-level sync success and failure rates. Of course, we still won't know why. some of that info might be in the sync ping, e.g. engine_failure_reason. But we would need to get that out somehow and into the existing sync_error events somehow, e.g. into their extra field.

Sep 12 '19 15:09 irrationalagent

@irrationalagent

My read of the code is that:

count(sync_start) == count(sync_end) + count(sync_error). If and only if exactly-one-(sync_end OR sync_error) is produced for each sync_start, unless the app is killed before sync stops.
we do not have a specific sync_timeout events. We have removed this as we were generating our own timeouts, not relying on the megazord to do it.
the sync_error has a value of unlocalized detail message of the exception.
we only get a SyncTelemetryPing on a successful sync (i.e. a sync_end). Otherwise we get a null.
we, i.e. lockwise, could generate a UUID for each sync start, and pair it with sync_end and sync_error. However, this is platform specific work. It isn't an insignificant amount of work: the telemetry for sync_start happens someway away from the actual sync, so their relative positions would have to be rethought.

Perhaps we should re-look at this once the next release has bedded in a little? (it's just gone out)

Sep 16 '19 20:09 jhugman

lockwise-android lockwise-android copied to clipboard

Handle SyncTelemetryPing in DataStore

Todo

lockwise-android
lockwise-android copied to clipboard