lockwise-android
lockwise-android copied to clipboard
Handle SyncTelemetryPing in DataStore
Requires #865
#865 updated AS and A-C to the latest versions with all required Megazord changes. The sync
call in DataStore.kt
now returns a SyncTelemetryPing
. This should be returned in the case of a successful completion.
Todo
- [ ] open question: in which cases will we receive the
SyncTelemetryPing
? Only on a successful sync? - [ ] send
SyncTelemetryPing
asvalue
orextra
in all applicable telemetry events
@irrationalagent
a) do we still need this? especially considering #862— TL;DR: we found the missing sync_start
events!— and that we're not able to add a sync id to the sync start, only successful sync_end
s.
b) do we need one per SyncTelemetryPing
? i.e. uid
; or the one per SyncInfo
?
c) where should this go in the event ping? as a value
or an extra
?
Sorry for the late response on this. I'm a bit out of the loop here, so bear with me.
it looks like you have the whole sync ping payload accessible here? if that's the case its going to be a pretty gnarly bit of json. I'm reasonably certain it will be too long to fit into the extra field of an event, and I'm not sure we would want to do that anyway (it def wont fit in the value
field).
So, if you found the sync start events that would be a good start towards helping us understand frequency of syncing and its success rate, and depending on a couple things that might be all we need here. a few questions about that:
- should there be at most one one sync start event per sync_end/sync_timeout/sync_error event
- conversely: for every sync start, will there always be at least one corresponding sync_end/sync_timeout/sync_error event? if not, could we know with some certainty what a missing "sync end" event would mean?
- would sync_end fire only on success, or would it fire on failure/error as well? e.g. would we expect a sync error event to generate both sync_end and sync_error, or just the latter?
Without a sync_id we would need to be sure that the above conditions hold in order for the sync events to be useful in analysis. What I'm getting at - we need to ensure that we (somewhat) reliably have a one-to-one mapping of events indicating sync is starting to events indicating its done (whether successfully or not). If we had a sync_id, meeting these requirements would be less important.
IF we can ensure that ^ then I think we will at least have enough data to understand high-level sync success and failure rates. Of course, we still won't know why. some of that info might be in the sync ping, e.g. engine_failure_reason
. But we would need to get that out somehow and into the existing sync_error events somehow, e.g. into their extra
field.
@irrationalagent
My read of the code is that:
-
count(sync_start)
==count(sync_end) + count(sync_error)
. If and only if exactly-one-(sync_end
ORsync_error
) is produced for eachsync_start
, unless the app is killed before sync stops. - we do not have a specific
sync_timeout
events. We have removed this as we were generating our own timeouts, not relying on the megazord to do it. - the
sync_error
has avalue
of unlocalized detail message of the exception. - we only get a
SyncTelemetryPing
on a successful sync (i.e. async_end
). Otherwise we get anull
. - we, i.e. lockwise, could generate a UUID for each sync start, and pair it with
sync_end
andsync_error
. However, this is platform specific work. It isn't an insignificant amount of work: the telemetry forsync_start
happens someway away from the actual sync, so their relative positions would have to be rethought.
Perhaps we should re-look at this once the next release has bedded in a little? (it's just gone out)