
Miovision: Duplicates in volumes_15min_mvt

Open gabrielwol opened this issue 1 year ago • 1 comments

This likely stems from changes made in https://github.com/CityofToronto/bdit_data-sources/pull/735. My guess is that the duplicates were inserted while adding new intersections, when two run-api commands were accidentally run in parallel and their steps interleaved: (1) clear, (2) clear, (1) agg, (2) agg, and so on.
Issue: the existing primary key (below) can't act as a unique constraint on the other columns because it includes volume_15min_mvt_uid. Since we already removed the foreign key referencing this column, it should be safe to remove the column from the primary key.

volumes_15min_mvt_int_uid_dt_bin_class_leg_mvmt_uid_pkey PRIMARY KEY (volume_15min_mvt_uid, intersection_uid, datetime_bin, classification_uid, leg, movement_uid)
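
To illustrate why the key above does not block duplicates, here is a minimal sketch that replays the guessed interleaving (clear, clear, agg, agg) against two simplified tables. It uses an in-memory SQLite database as a stand-in for the real PostgreSQL table; the table names `with_uid_in_pk` and `without_uid_in_pk`, the helper functions, and the row values (classification 1, leg 'N', movement 1) are all hypothetical and only serve the demonstration.

```python
import itertools
import sqlite3

# The columns that should identify one aggregated row.
KEY_COLS = "intersection_uid, datetime_bin, classification_uid, leg, movement_uid"


def make_table(conn, name, pk_cols):
    """Create a simplified copy of volumes_15min_mvt with the given primary key."""
    conn.execute(f"""
        CREATE TABLE {name} (
            volume_15min_mvt_uid INTEGER NOT NULL,
            intersection_uid INTEGER NOT NULL,
            datetime_bin TEXT NOT NULL,
            classification_uid INTEGER NOT NULL,
            leg TEXT NOT NULL,
            movement_uid INTEGER NOT NULL,
            PRIMARY KEY ({pk_cols})
        )""")


# Assumption: the uid gets a fresh value on every insert.
uid_seq = itertools.count(1)


def clear(conn, table):
    """Stand-in for the 'clear' step: delete existing rows for the intersection."""
    conn.execute(f"DELETE FROM {table} WHERE intersection_uid = 69")


def agg(conn, table):
    """Stand-in for the 'agg' step: insert one aggregated row (placeholder values)."""
    conn.execute(
        f"INSERT INTO {table} VALUES (?, 69, '2023-08-10 00:15:00', 1, 'N', 1)",
        (next(uid_seq),),
    )


def interleaved_runs(conn, table):
    """Replay (1) clear - (2) clear - (1) agg - (2) agg.

    Returns (final row count, whether the second agg was rejected).
    """
    clear(conn, table)
    clear(conn, table)
    agg(conn, table)
    try:
        agg(conn, table)
        second_agg_rejected = False
    except sqlite3.IntegrityError:
        second_agg_rejected = True
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return count, second_agg_rejected


conn = sqlite3.connect(":memory:")
make_table(conn, "with_uid_in_pk", "volume_15min_mvt_uid, " + KEY_COLS)
make_table(conn, "without_uid_in_pk", KEY_COLS)

old_result = interleaved_runs(conn, "with_uid_in_pk")
new_result = interleaved_runs(conn, "without_uid_in_pk")
print(old_result)  # (2, False): the duplicate row is accepted
print(new_result)  # (1, True): the duplicate row is rejected
```

With the uid in the key, both inserts succeed because each row carries a different uid; without it, the second insert violates the key and fails loudly instead of silently duplicating data.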

The following ranges contain rows duplicated on (intersection_uid, datetime_bin, classification_uid, leg, movement_uid) and will need to be re-aggregated:

intersection_uid  min                  max
69                2023-08-10 00:15:00  2023-11-28 23:45:00
70                2023-08-10 00:15:00  2023-11-28 23:45:00
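
A summary like the one above can be produced by grouping on the five key columns and then taking the minimum and maximum datetime_bin per intersection. The sketch below is one way to do that and is not necessarily the query that was actually run; it uses an in-memory SQLite table with a handful of made-up rows (not real Miovision data), and the table is reduced to the columns that matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE volumes_15min_mvt (
        volume_15min_mvt_uid INTEGER PRIMARY KEY,
        intersection_uid INTEGER,
        datetime_bin TEXT,
        classification_uid INTEGER,
        leg TEXT,
        movement_uid INTEGER
    )""")

# Illustrative rows only: intersection 69 has two duplicated bins,
# intersection 70 has none in this toy data set.
sample_rows = [
    (69, "2023-08-10 00:15:00", 1, "N", 1),
    (69, "2023-08-10 00:15:00", 1, "N", 1),  # duplicate
    (69, "2023-08-10 00:30:00", 1, "N", 1),
    (69, "2023-08-10 00:30:00", 1, "N", 1),  # duplicate
    (69, "2023-08-10 00:45:00", 1, "N", 1),
    (70, "2023-08-10 00:15:00", 1, "N", 1),
]
conn.executemany(
    "INSERT INTO volumes_15min_mvt "
    "(intersection_uid, datetime_bin, classification_uid, leg, movement_uid) "
    "VALUES (?, ?, ?, ?, ?)",
    sample_rows,
)

# Inner query: key combinations that occur more than once.
# Outer query: first and last duplicated bin per intersection.
DUPLICATE_RANGES_SQL = """
    SELECT intersection_uid, MIN(datetime_bin), MAX(datetime_bin)
    FROM (
        SELECT intersection_uid, datetime_bin
        FROM volumes_15min_mvt
        GROUP BY intersection_uid, datetime_bin, classification_uid, leg, movement_uid
        HAVING COUNT(*) > 1
    ) AS dupes
    GROUP BY intersection_uid
    ORDER BY intersection_uid
"""

duplicate_ranges = conn.execute(DUPLICATE_RANGES_SQL).fetchall()
print(duplicate_ranges)
# [(69, '2023-08-10 00:15:00', '2023-08-10 00:30:00')]
```

The same SQL should run largely unchanged on PostgreSQL, where datetime_bin would be a timestamp rather than text.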

gabrielwol, Feb 13 '24 20:02

Running this to fix the duplicates:

python3 intersection_tmc.py run-api --intersection 69 --intersection 70 --start_date 2023-08-10 --end_date 2023-11-29 --agg --path /data/airflow/data_scripts/volumes/miovision/api/config.cfg

gabrielwol, Feb 13 '24 20:02