telemetry-ios
telemetry-ios copied to clipboard
Missing core ping research mega thread
We can use this as a thread to track the progress of this work, or drop general notes about this task in here.
Hopefully today or early tomorrow we can land these issues, which cover all the known issues that relate to this:
- issue #40
- issue #20
- issue #39
Dashboard for monitoring missing ping sequence numbers: https://sql.telemetry.mozilla.org/dashboard/core-ping-sequence-numbers-count-per-day
The docs on creating the above graph: : https://docs.telemetry.mozilla.org/cookbooks/realtime_analysis_plugin.html
Recommended we add Fields[docType] == ‘core’
to isolate to core pings.
Additional useful dashboards:
https://sql.telemetry.mozilla.org/queries/5548#11264
https://sql.telemetry.mozilla.org/dashboard/ios-focus
https://sql.telemetry.mozilla.org/dashboard/executive-dashboard-mobile-users-engagement
select * from default.mobile_clients_v2 limit 1
https://pipeline-cep.prod.mozaws.net/dashboard_output/graphs/analysis.frank.focus_event_ping_count_by_os.messages_per_minute.html
I modified the missing ping query for Beta-only: https://sql.telemetry.mozilla.org/queries/47290/source#128091
Here is how it looks
We see a huge spike in successful pings, and the missing pings line staying roughly constant (for the past few days). Ideally, the missing pings would drop to zero, but I may have to filter out non-latest builds which are adding to the missing ping counts.
Activity Stream DAU/MAU graph for comparison (release and beta): https://sql.telemetry.mozilla.org/dashboard/activity-stream-ios-release-metrics-summary https://sql.telemetry.mozilla.org/dashboard/activity-stream-ios-metrics-summary-beta-
UT query for Beta DAU: https://sql.telemetry.mozilla.org/queries/47344/source#table
Today's results show trend indicates more healthy pings:
Beta DAU accuracy got better also (see bottom 2 rows vs upper rows):
Tip: restrict data to build id with telemetry_core_parquet
, example for 7227: metadata.app_build_id='7227'
Build 7227 went from 1% missing pings and increased to 9% over 3 days:
Downloading the data from build 7473 and manually analyzing it, I see 25 missing ping sequences only:
missing 2903B2D6-8BB7-4C42-BCC4-9223B5043F7B, jumped from 142 to 147
missing 2903B2D6-8BB7-4C42-BCC4-9223B5043F7B, jumped from 147 to 163
missing 298984F1-D6F6-47B9-877C-6DC372D8A406, jumped from 202 to 211
missing 298984F1-D6F6-47B9-877C-6DC372D8A406, jumped from 211 to 226
missing 2AA689D4-64D6-4E1A-B6C0-3A134BA96D7C, jumped from 105 to 112
missing 2ADA0C46-1F72-4BEB-8C94-5707C947D63B, jumped from 100 to 107
missing 3F2D3730-FF78-48DA-9C16-F274BA861A90, jumped from 131 to 139
missing 448796B7-4FB7-4D74-89CC-785469072A61, jumped from 172 to 211
missing 85296420-6032-4BFC-93CB-AA93C128C964, jumped from 200 to 247
missing 859F5AB9-89CB-4509-ADB8-5836862BD046, jumped from 109 to 114
missing A377DF20-8528-4A4E-A2FF-BBD4F4C31B60, jumped from 135 to 156
missing A97B5152-224A-4144-9D40-5E4CB975C555, jumped from 100 to 104
missing ABB0B75E-CA80-42FF-B84F-4D7A0530EE75, jumped from 100 to 114
missing BA70B1F7-E1B2-4505-9C64-8B6DE5781CDA, jumped from 166 to 179
missing BB4156CE-B768-497B-B047-FFD95D9EC86B, jumped from 130 to 132
missing C4ED2A6A-2C56-4F80-9FE2-BB31B5DDCDF3, jumped from 101 to 108
missing C7E83F8B-C642-4B88-BC39-DEB4BDD17F9F, jumped from 101 to 103
missing E342D358-FB23-4273-92A2-C56E592F2CA4, jumped from 179 to 192
missing ED909420-96F1-4A2E-B792-3F7D8A67F727, jumped from 103 to 111
missing EF3E375B-BE1D-44D1-9CCC-9D9B41CA8762, jumped from 167 to 193
missing F877CD84-807D-4509-92E6-9CB06C49ACEF, jumped from 113 to 124
missing FD3A5951-2797-4984-84BD-AA6033D71D2A, jumped from 291 to 333
This is the data query:
SELECT
client_id,
submission_date_s3,
seq
FROM
telemetry_core_parquet
WHERE
app_name = 'Fennec'
AND os = 'iOS'
AND CHANNEL='beta'
AND metadata.app_build_id='7473'
AND submission_date_s3 < date_format(current_date, '%Y%m%d')
And this is the python to find missing pings:
import sys
d = {}
with open(sys.argv[1]) as f:
for line in f:
(key, date, sval) = line.split(',')
val = int(sval)
if key in d:
if val > d[key] + 1:
print('missing ' + key + ', jumped from ' + str(d[key]) + ' to '+ str(val))
d[key] = val
Looks like we both put a little script together for this ... my result is pretty similar. Same input query actually.
sarentz@Risa ~> ./sequence.py ~/Downloads/New_Query_2017_10_31.csv
Total clients : 842
Total pings : 6870
Clients with duplicates : 23 (2.73)
Missing pings : 39 (0.57)
Clients with missing pings : 29 (3.44)
https://gist.github.com/st3fan/4404ff72e5883e455865bf745ecd165b
Here is the reported errors and their counts in Sentry for build 7473 (. These are all normal URL connection errors and don't cause ping deletion:
-1009: 3
-1001: 3
310: 2
-1202: 3
The request timed out.: 2
A server with the specified hostname could not be found.: 3
The Internet connection appears to be offline.:62
An SSL error has occurred and a secure connection to the server cannot be made.:22
The last ones are reported with no error code, so I print the error message.
310 errors are discussed here: https://github.com/mozilla-mobile/telemetry-ios/issues/73
DAU for 10.0: https://sql.telemetry.mozilla.org/queries/47344/source#table