bdit_data-sources icon indicating copy to clipboard operation
bdit_data-sources copied to clipboard

Miovision: discrepencies between volumes_15min_mvt and unacceptable_gaps

Open gabrielwol opened this issue 1 year ago • 1 comments

          Here are 12 dates with records that should be nulls but are not. You were lucky to pick one of them at random! I will create a new issue for this investigation. 
"datetime_bin" "count"
"2024-02-05" 46
"2024-02-01" 43
"2024-01-20" 3990
"2024-01-19" 4032
"2024-01-18" 4032
"2024-01-17" 4032
"2024-01-16" 4032
"2024-01-15" 4032
"2024-01-14" 4032
"2024-01-13" 4032
"2024-01-02" 44
"2023-01-22" 42
SELECT datetime_bin::date, COUNT(*)
FROM miovision_api.volumes_15min_mvt AS v
JOIN miovision_api.unacceptable_gaps AS un USING (intersection_uid, datetime_bin)
WHERE v.volume IS NOT NULL
GROUP BY 1
ORDER BY 1 DESC

Originally posted by @gabrielwol in https://github.com/CityofToronto/bdit_data-sources/issues/863#issuecomment-1938939546

gabrielwol avatar Feb 12 '24 16:02 gabrielwol

All these discrepencies were caused by 5 unacceptable_gap entries:

"dt" "intersection_uid" "gap_start" "gap_end"
"2023-01-22" 15 "2023-01-22 23:57:00" "2023-01-23 00:04:00"
"2024-01-02" 60 "2024-01-02 10:21:00" "2024-01-03 00:00:00"
"2024-01-12" 6 "2024-01-13 00:00:00" "2024-01-20 23:45:00"
"2024-02-02" 7 "2024-02-01 23:45:00" "2024-02-02 22:57:00"
"2024-02-05" 41 "2024-02-05 23:52:00" "2024-02-06 00:17:00"
SELECT DISTINCT un.dt, un.intersection_uid, un.gap_start, un.gap_end
FROM miovision_api.volumes_15min_mvt AS v
JOIN miovision_api.unacceptable_gaps AS un USING (intersection_uid, datetime_bin)
WHERE v.volume IS NOT NULL
    AND v.datetime_bin >= '2023-01-02' AND v.datetime_bin < '2024-02-06'

The first/fourth/fifth record above are because the find_gaps script identifies gaps that overlap midnight, but started the day before. I didn't anticipate that this would make the pipeline non-idempotent. I will remove this behaviour now at the minor expense of missing small gaps that start over midnight.

The other two issues I can't think of a cause for! I can re-aggregate these 2 ranges and we can keep our eyes out for similar issues in the future.

gabrielwol avatar Feb 12 '24 19:02 gabrielwol