
Missing Miovision Dates

Open radumas opened this issue 2 years ago • 3 comments

  • [ ] January 27, 2022
  • [x] June 29, 2022

at least for the RapidTO locations

radumas avatar Aug 10 '22 15:08 radumas

Backfilling for January 27, 2022 returned the error below:

[2022-08-11 09:44:03,121] {bash_operator.py:157} INFO - 11 Aug 2022 09:44:03     	INFO    Bayview Avenue and River Street     2022-01-27 06:00:00
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO - 11 Aug 2022 09:44:03     	CRITICAL    Traceback (most recent call last):
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -   File "/etc/airflow/data_scripts/volumes/miovision/api/intersection_tmc.py", line 233, in get_road_class
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -     return self.roaduser_class[ru_class]
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO - KeyError: 'Pedestrian'
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO - 
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO - During handling of the above exception, another exception occurred:
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO - 
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO - Traceback (most recent call last):
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -   File "/etc/airflow/data_scripts/volumes/miovision/api/intersection_tmc.py", line 99, in run_api
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -     pull_data(conn, start_time, end_time, intersection, path, pull, key, dupes)
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -   File "/etc/airflow/data_scripts/volumes/miovision/api/intersection_tmc.py", line 463, in pull_data
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -     c_start_t, c_end_t)
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -   File "/etc/airflow/data_scripts/volumes/miovision/api/intersection_tmc.py", line 301, in get_intersection
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -     table_veh = self.process_response('tmc', response_tmc)
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -   File "/etc/airflow/data_scripts/volumes/miovision/api/intersection_tmc.py", line 296, in process_response
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -     + self.process_tmc_row(row) for row in data]
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -   File "/etc/airflow/data_scripts/volumes/miovision/api/intersection_tmc.py", line 296, in <listcomp>
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -     + self.process_tmc_row(row) for row in data]
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -   File "/etc/airflow/data_scripts/volumes/miovision/api/intersection_tmc.py", line 274, in process_tmc_row
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -     classification = self.get_road_class(row)
[2022-08-11 09:44:03,679] {bash_operator.py:157} INFO -   File "/etc/airflow/data_scripts/volumes/miovision/api/intersection_tmc.py", line 236, in get_road_class
[2022-08-11 09:44:03,680] {bash_operator.py:157} INFO -     .format(row['class']))
[2022-08-11 09:44:03,680] {bash_operator.py:157} INFO - ValueError: vehicle class Pedestrian not recognized!
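For readers unfamiliar with chained tracebacks: the two-part trace above is what Python prints when a new exception is raised inside an `except` block. A minimal sketch of a lookup with that shape follows. It is reconstructed only from the two source lines visible in the traceback; the mapping contents and the standalone function signature are assumptions, not the real `intersection_tmc.py` code.

```python
# Hypothetical reconstruction from the traceback only. The entries in this
# mapping are placeholders; the real class table is not shown in the thread.
ROADUSER_CLASS = {"Light": 1, "Bicycle": 2}


def get_road_class(row, roaduser_class=ROADUSER_CLASS):
    """Map the API's class label to an ID, failing loudly on unknown labels."""
    ru_class = row["class"]
    try:
        return roaduser_class[ru_class]
    except KeyError:
        # Raising inside the except block produces the chained
        # "During handling of the above exception..." output.
        raise ValueError(
            "vehicle class {} not recognized!".format(ru_class))
```

With a mapping like this, any label the API starts returning that is missing from the table (here `'Pedestrian'`) aborts the whole pull for that intersection and time window.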

chmnata avatar Aug 11 '22 15:08 chmnata

Considering we don't need that location for RapidTO... wondering if we should run this manually and exclude that intersection, or run it manually for all the other intersections.

I can walk someone through text processing to create the command line options for all intersections if someone is stumped

radumas avatar Aug 15 '22 13:08 radumas

Backfilled Jan 27 for all other intersections except uid 65. Tried re-pulling for 65 after Miovision rolled back the config and still got the same error.

chmnata avatar Aug 19 '22 19:08 chmnata

@tankedman can you create a new script on top of the original Miovision pulling script that runs over a range of dates (start_date, end_date) and:

  • logs the days that have the 'Pedestrian' error,
  • logs the row that returned the error from the API,
  • then skips the day with the error and continues on to pull the next day?

So that we can at least pull the days that we can pull :meow_dio:
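The request above (loop over a date range, log and skip the days that hit the error) could be sketched roughly as follows. This is only an illustration under assumptions: `backfill`, `pull_day`, and `PedestrianRowError` are hypothetical names, the end date is assumed to be inclusive, and the real pulling script's interface may look quite different.

```python
import logging
from datetime import timedelta

logger = logging.getLogger("miovision_backfill")  # hypothetical logger name


class PedestrianRowError(ValueError):
    """Hypothetical error that carries the API row that could not be mapped."""

    def __init__(self, row):
        super().__init__("vehicle class Pedestrian not recognized!")
        self.row = row


def backfill(start_date, end_date, pull_day):
    """Pull each day in [start_date, end_date]; return (day, row) failures.

    `pull_day` stands in for the existing pull logic and is assumed to
    raise PedestrianRowError when the API returns an unmapped row.
    """
    failures = []
    day = start_date
    while day <= end_date:
        try:
            pull_day(day)
        except PedestrianRowError as err:
            # Log the day and the offending row, then move on to the next day.
            logger.error("Pedestrian error on %s, row: %s", day, err.row)
            failures.append((day, err.row))
        day += timedelta(days=1)
    return failures
```

Only the known error is caught here, so any other exception still stops the run instead of being silently skipped.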

chmnata avatar Jan 31 '23 15:01 chmnata

The "Pedestrian" error movements are now available in my schema, under jmok.miov_pederr.

Additionally, the backfill script has now pulled all non-error movements from 2021-06-16 to 2022-05-31.

tankedman avatar Feb 21 '23 21:02 tankedman

Thanks for putting together the table, @tankedman

Looking at the log file from your processes, there are 484 occurrences of the warning "You are trying to insert duplicate data!" spread across different days, starting Aug 21, 2021 and running until the last day pulled, May 30, 2022.

For example, the message says that the row at 2021-08-21 11:59:00 with classification_id: 10, leg: S, movement_uid: 7, and volume: 1 is duplicated. In this case it is indeed a duplicate, since the row inserted first is exactly the same, as is true for 483 of the 484 duplicated rows shown in the log. I've checked some recent Airflow Miovision pulling logs and did not see similar warnings.

16 Feb 2023 21:23:54     	INFO    Bayview Avenue and River Street     2021-08-21 06:00:00
16 Feb 2023 21:23:56     	INFO    Completed data pulling from 2021-08-21 06:00:00 to 2021-08-21 12:00:00
16 Feb 2023 21:23:57     	WARNING    WARNING:  You are trying to insert duplicate data! 
 65, 2021-08-21 11:59:00, 10, S, 7, 1
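For anyone wanting to repeat this check, here is one way the duplicate rows could be extracted from a log like the one above so they can be loaded into a table and compared. It is a sketch, not the method actually used in this thread: the function name is made up, and the column order (intersection_uid, datetime_bin, classification_uid, leg, movement_uid, volume) is inferred from the example message.

```python
import re
from datetime import datetime

# One duplicate row as printed on the line after the warning, e.g.
# " 65, 2021-08-21 11:59:00, 10, S, 7, 1" (column order is an assumption).
ROW_RE = re.compile(
    r"^\s*(\d+), (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), "
    r"(\d+), (\w+), (\d+), (\d+)\s*$"
)
WARNING_TEXT = "You are trying to insert duplicate data!"


def parse_duplicate_rows(lines):
    """Return the row tuples that directly follow a duplicate-data warning."""
    rows = []
    expect_row = False
    for line in lines:
        if WARNING_TEXT in line:
            expect_row = True
            continue
        if expect_row:
            match = ROW_RE.match(line)
            if match:
                uid, ts, cls, leg, mov, vol = match.groups()
                rows.append((
                    int(uid),
                    datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"),
                    int(cls), leg, int(mov), int(vol),
                ))
            expect_row = False
    return rows
```

Calling `len(parse_duplicate_rows(open(path)))` on a log file gives the number of warnings that were followed by a parseable row, where `path` is whatever file the backfill wrote to.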


However, there is ONE case where we got duplicated data for the same datetime, classification_id, and movement_id, but with a different volume :meow_dio:

select *
from natalie.miovision_err_dups dups
inner join miovision_api.volumes using (intersection_uid, datetime_bin, classification_uid, leg, movement_uid)
where dups.volume != volumes.volume

On the magical day of 2021-11-07 at 01:59:00, the inserted data shows a volume of 3, but the log says it's ONE:

65, 2021-11-07 01:59:00, 1, W, 2, 1

We should ask miovision about this on top of the peds error :meow_dead:

Peds error summary:

  • 290 of the 349 days pulled between 2021-06-16 and 2022-05-31 returned at least 1 error from the pedestrian class, ranging from 18 to 232 rows!

chmnata avatar Feb 24 '23 22:02 chmnata

@radumas is this still fixable? Given https://github.com/CityofToronto/bdit_data-sources/issues/287#issuecomment-1677316909 it would seem like this is now historical data that is unlikely to be recoverable?

Nate-Wessel avatar Aug 15 '23 16:08 Nate-Wessel

Different error

radumas avatar Aug 15 '23 20:08 radumas

Added in anomalous_ranges uid: 897

chmnata avatar Feb 14 '24 20:02 chmnata