
`wys.aggregate_speed_counts_one_hour_5kph` failing with duplicate key error

Open gabrielwol opened this issue 1 year ago • 2 comments

See the Airflow logs. I suspect this occurred because, on one of the (many) retries for this date, new data was pulled for an api_id / datetime_bin combination that had previously been aggregated. The aggregate table is not cleared before inserting, so new data (speed_count_uid IS NULL) matching an existing combination results in this duplicate key error. Due to #719, the new data could overlap only on api_id / datetime_bin and not speed_id.
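The suspected failure mode can be reproduced on a toy schema. This is only a sketch under assumptions: it uses an in-memory SQLite database rather than the real Postgres tables, and the table names (`raw_data`, `agg`), the columns `speed_bin` and `n`, the unique key, and the `pull` / `aggregate` / `clear_day` helpers are all hypothetical stand-ins, not the actual `wys` schema or function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-ins for the raw and aggregate tables.
conn.executescript("""
CREATE TABLE raw_data (
    api_id INTEGER, datetime_bin TEXT, speed_bin INTEGER, n INTEGER,
    speed_count_uid INTEGER  -- NULL means "not yet aggregated"
);
CREATE TABLE agg (
    api_id INTEGER, datetime_bin TEXT, speed_bin INTEGER, volume INTEGER,
    UNIQUE (api_id, datetime_bin, speed_bin)  -- assumed unique key
);
""")

def pull(conn, rows):
    """Simulate an API pull: new raw rows arrive un-aggregated."""
    conn.executemany("INSERT INTO raw_data VALUES (?, ?, ?, ?, NULL)", rows)

def aggregate(conn):
    """Insert un-aggregated raw rows into agg without clearing agg first."""
    conn.execute("""
        INSERT INTO agg (api_id, datetime_bin, speed_bin, volume)
        SELECT api_id, datetime_bin, speed_bin, SUM(n)
        FROM raw_data
        WHERE speed_count_uid IS NULL
        GROUP BY api_id, datetime_bin, speed_bin
    """)
    conn.execute(
        "UPDATE raw_data SET speed_count_uid = 1 WHERE speed_count_uid IS NULL"
    )

def clear_day(conn, day):
    """Delete one day of rows from both tables (the temporary fix)."""
    for table in ("agg", "raw_data"):
        conn.execute(
            f"DELETE FROM {table} "
            "WHERE datetime_bin >= ? AND datetime_bin < datetime(?, '+1 day')",
            (day, day),
        )

day = "2023-08-25 00:00:00"
pull(conn, [(1, day, 5, 3)])
aggregate(conn)                # first run succeeds

pull(conn, [(1, day, 5, 2)])   # a retry pulls new data for the same bin
try:
    aggregate(conn)
    failed = False
except sqlite3.IntegrityError:
    failed = True              # duplicate key on the assumed unique key

clear_day(conn, day)           # clear the date in both tables
pull(conn, [(1, day, 5, 2)])   # simulated re-run pulls the data again
aggregate(conn)
rows = conn.execute("SELECT * FROM agg").fetchall()
```

In this toy setup the second `aggregate` call fails with a uniqueness violation, and clearing the date from both tables before re-running lets the aggregation succeed.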

gabrielwol, Oct 06 '23 20:10

@radumas a temporary fix would be to clear this date:

-- Clear the aggregated rows for 2023-08-25.
DELETE FROM wys.speed_counts_agg_5kph
WHERE
    datetime_bin >= '2023-08-25 00:00:00'::timestamp
    AND datetime_bin < '2023-08-25 00:00:00'::timestamp + interval '1 day';

-- Clear the raw rows for the same date.
DELETE FROM wys.raw_data_2023
WHERE
    datetime_bin >= '2023-08-25 00:00:00'::timestamp
    AND datetime_bin < '2023-08-25 00:00:00'::timestamp + interval '1 day';

Then re-run the task via the Airflow UI.

gabrielwol, Oct 06 '23 21:10

Deleted and task cleared. Thanks Gabe!

radumas, Oct 10 '23 12:10