bdit_data-sources icon indicating copy to clipboard operation
bdit_data-sources copied to clipboard

Miovision - Missing Data + Cameras Down

Open aharpalaniTO opened this issue 3 years ago • 6 comments

Three issues here:

  1. There are a number of intersections which need to be backfilled with data (outlined here: #416)
  2. There are intersections that are currently down.
  3. There are intersections that need to be added to miovision.intersections

1. Intersections requiring backfill

List of intersections that should likely be re-pulled via API based on difference between install date and first available date of data:

WITH earliest_date AS (

	SELECT intersection_uid, MIN(datetime_bin)::date AS dt
	FROM miovision_api.volumes_15min
	GROUP BY intersection_uid
)

SELECT intersection_uid, intersection_name, dt AS date_available, date_installed
FROM miovision_api.intersections
INNER JOIN earliest_date USING (intersection_uid)
ORDER BY intersection_uid
intersection_uid intersection_name date_available date_installed
58 Bloor Street West and Dufferin Street 06/21/2021 12/22/2020
59 Eglinton Avenue West and Jane Street 06/21/2021 06/02/2021
60 Harbord Street/St. George Street/Hoskin Street 06/21/2021 05/14/2021
61 Jane Street and Lawrence Avenue West 06/21/2021 06/08/2021
62 Jane Street and Wilson Avenue 06/21/2021 06/09/2021
63 Sheppard Avenue West and Weston Road 06/21/2021 05/14/2021
64 Yonge Street / Davenport Road / Church St 06/21/2021 05/14/2021

2. Intersections currently down (based on Trafficlink and data in RDS):

  • Jane and Steeles: down as of August 6, 2021 (unsure when installed)
  • Jane and Eglinton: down as of July 2, 2021 (installed on June 2, 2021)
  • Wellington and Bay: down as of April 3, 2021

3. Intersections not in database

  • Jane and Steeles
  • Bayview and River

aharpalaniTO avatar Oct 25 '21 17:10 aharpalaniTO

When running the intersection_tmc script to populate volumes for intersections 67 (Sheppard & Keele) and 68 (Steeles & Jane), the volumes table is not populated and the script outputs "INFO Sheppard Avenue West and Keele Street not active on 2021-06-16 06:00:00" For all times within the selected period (one day from their install date). Likely the cameras were not active. @radumas @chmnata

tankedman avatar Aug 18 '22 22:08 tankedman

@tankedman Archived documentation is out of date, please run the aggregation part with the following functions in order:

miovision_api.find_gaps
miovision_api.aggregate_15_min_mvt
miovision_api.aggregate_15_min
miovision_api.report_dates

chmnata avatar Aug 19 '22 19:08 chmnata

Documenting my confusion here so we can discuss on Monday: Jedwin's nohup.out shows that miovision_api.aggregate_15_min_mvt is trying insert a duplicated value

ERROR:  duplicate key value violates unique constraint "volumes_15min_mvt_intersection_uid_datetime_bin_classificat_key"
DETAIL:  Key (intersection_uid, datetime_bin, classification_uid, leg, movement_uid)=(1, 2022-07-20 23:00:00, 1, E, 1) already exists.

But when I check the volumes table, there are no rows with volume_15min_mvt_uid is null in the table, which means this shouldn't be aggregated at all since there is a where clause in the function that only runs volumes where volume_15min_mvt_uid is null is null.

select * from  miovision_api.volumes
where intersection_uid = 1 and datetime_bin::Date = '2022-07-20' and volume_15min_mvt_uid is null 

There is no parameter to specific intersections to aggregate, meaning the function should know not to aggregate and insert data that was already aggregated (aka having a volume_15min_mvt_uid)

chmnata avatar Aug 26 '22 22:08 chmnata

So there is indeed a bug in the aggregation function. Because of the 0-padding of null-volumes it will try to insert a 0 where there might already be 0 data.

https://github.com/CityofToronto/bdit_data-sources/blob/master/volumes/miovision/sql/function-aggregate-volumes_15min_mvt.sql#L47-L49

That is because the aggregation query starts with the cross product of intersection X time-bins (from generate_series) for the date(s) to aggregate. It then left joins on existing volumes excluding A.volume_15min_mvt_uid IS NULL, but there are two conditions in which that field may be NULL:

  1. the row exists and the column is null
  2. the row doesn't exist

There's a subsequent HAVING clause that should exclude 0-volumes from being inserted... except for the classifications we deem to be important.

I can think of two solutions:

  1. add an anti-join to this insert query to exclude intersection X time-bins that have already been populated (and I guess stop relying on this volume_15min_mvt_uid foreign key relationship
  2. Insert 0s into volumes where they've been zero-padded.

Deleting volumes for specific intersections to re-run the CLI won't work because this function still zero-pad the other intersections on that date

radumas avatar Aug 29 '22 18:08 radumas

As discussed with @radumas, we can add a filter for the intersection_uid we are aggregating for in the function's sql and run that first . @tankedman wanna try creating the query?

chmnata avatar Aug 29 '22 19:08 chmnata

Agg. process complete for intersections 65 and 66 from 07-21-2022 to 08-15-2022. Next step is to backfill to start date 06-16-2021

tankedman avatar Sep 02 '22 20:09 tankedman