data-infra
GTFS RT validation: Investigate `[Errno 2] No such file or directory` issues
`[Errno 2] No such file or directory` is our most common cause of validation failures. (In the midnight UTC hour on 6/29/23, it occurred for 13 distinct URLs, 4 of them for the entire hour.)
See for example:
    SELECT
      name,
      SUBSTR(validation_exception, 0, 35),
      COUNT(*)
    FROM `cal-itp-data-infra.staging.stg_gtfs_rt__vehicle_positions_validation_outcomes`
    WHERE dt = '2023-06-29'
      AND hour = '2023-06-29T00:00:00'
      AND process_stderr IS NULL
      AND NOT validation_success
    GROUP BY 1, 2
    ORDER BY 3 DESC
The goal of this ticket is to investigate why this is happening.
Per @atvaccaro:
IIRC these are situations where the validator is not writing an output JSON file to the expected path
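
If that hypothesis is right, one way to get more signal is to check for the output file explicitly before opening it, and to record what the validator left behind when the file is missing. The following is only a minimal sketch, assuming the validator is invoked as a subprocess that should write a JSON results file. `run_and_load`, `MissingValidatorOutput`, `cmd`, and `expected_output` are hypothetical names for illustration, not identifiers from the data-infra codebase.

```python
import json
import subprocess
from pathlib import Path


class MissingValidatorOutput(Exception):
    """Raised when the process exits without writing the expected JSON file."""


def run_and_load(cmd, expected_output: Path):
    # Hypothetical wrapper: `cmd` stands in for the real validator invocation
    # and `expected_output` for the path where its results JSON should appear.
    proc = subprocess.run(cmd, capture_output=True, text=True)

    if not expected_output.exists():
        # Instead of letting open() fail with a bare "[Errno 2]", record what
        # is actually in the output directory plus the process diagnostics.
        parent = expected_output.parent
        found = sorted(p.name for p in parent.iterdir()) if parent.is_dir() else []
        raise MissingValidatorOutput(
            f"expected {expected_output.name!r}, found {found}; "
            f"returncode={proc.returncode}; stderr={proc.stderr[-500:]!r}"
        )

    with expected_output.open() as f:
        return json.load(f)
```

An exception message of this shape would show whether the validator wrote a differently named file, wrote nothing, or exited with an error, which the truncated `validation_exception` text alone does not reveal.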