bdit_data-sources
`wys.aggregate_speed_counts_one_hour_5kph` failing with duplicate key error
See the Airflow logs. I suspect this occurred because, on one of the (many) retries for this date, new data was pulled for an `api_id` / `datetime_bin` combination that had previously been aggregated. The aggregate table is not cleared before inserting, so new data (`speed_count_uid IS NULL`) matching an existing combination results in this duplicate key error. Due to #719, the new data could overlap only on `api_id` / `datetime_bin` and not `speed_id`.
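The suspected failure mode can be reproduced in miniature. The following is only a sketch using an in-memory SQLite database, not the actual `wys` schema or aggregation function: the table layouts, the `volume` column, the `pull` and `aggregate` helpers, and the unique key on (`api_id`, `datetime_bin`) are all simplifying assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-ins for the raw and aggregate tables.
conn.executescript("""
CREATE TABLE raw_data (
    api_id INTEGER, datetime_bin TEXT, speed INTEGER, count INTEGER,
    speed_count_uid INTEGER  -- NULL until the row has been aggregated
);
CREATE TABLE speed_counts_agg (
    api_id INTEGER, datetime_bin TEXT, volume INTEGER,
    UNIQUE (api_id, datetime_bin)  -- assumed key; the real one may differ
);
""")

def pull(rows):
    """Simulate an API pull: new raw rows arrive with speed_count_uid NULL."""
    conn.executemany(
        "INSERT INTO raw_data (api_id, datetime_bin, speed, count) "
        "VALUES (?, ?, ?, ?)", rows)

def aggregate():
    """Aggregate only rows not yet aggregated, without clearing the target."""
    conn.execute("""
        INSERT INTO speed_counts_agg (api_id, datetime_bin, volume)
        SELECT api_id, datetime_bin, SUM(count)
        FROM raw_data
        WHERE speed_count_uid IS NULL
        GROUP BY api_id, datetime_bin""")
    conn.execute(
        "UPDATE raw_data SET speed_count_uid = 1 WHERE speed_count_uid IS NULL")

first = (1, "2023-08-25 00:00:00", 40, 3)
late = (1, "2023-08-25 00:00:00", 55, 2)  # same api_id / datetime_bin

pull([first])
aggregate()          # first run succeeds
pull([late])         # a retry pulls new data for an already-aggregated bin
try:
    aggregate()
    failed = False
except sqlite3.IntegrityError:
    failed = True    # duplicate key on (api_id, datetime_bin)

# Temporary fix: clear the date from both tables, re-pull, re-aggregate.
for table in ("speed_counts_agg", "raw_data"):
    conn.execute(
        f"DELETE FROM {table} WHERE datetime_bin >= '2023-08-25 00:00:00' "
        "AND datetime_bin < '2023-08-26 00:00:00'")
pull([first, late])
aggregate()
rows = conn.execute(
    "SELECT api_id, datetime_bin, volume FROM speed_counts_agg").fetchall()
```

In this toy version, the second `aggregate()` call fails because the late row has `speed_count_uid IS NULL` but its `api_id` / `datetime_bin` already exists in the aggregate table. Clearing both tables for the date lets the re-run insert a single combined row.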
@radumas a temporary fix would be to clear this date:
```sql
DELETE FROM wys.speed_counts_agg_5kph
WHERE
    datetime_bin >= '2023-08-25 00:00:00'::timestamp
    AND datetime_bin < '2023-08-25 00:00:00'::timestamp + interval '1 day';

DELETE FROM wys.raw_data_2023
WHERE
    datetime_bin >= '2023-08-25 00:00:00'::timestamp
    AND datetime_bin < '2023-08-25 00:00:00'::timestamp + interval '1 day';
```
Then re-run the task via the Airflow UI.
Deleted and task cleared. Thanks Gabe!