bdit_data-sources
Automate data quality checks for VDS schema
Build on the work done in #605 and #626 to bring the automated data quality checks for the old `rescu` schema into the `vds` schema.
The new schema contains more disaggregated RESCU data, which allows us to expand the checks to the individual lane level.
See also: #589
Brainstorm from 2020 (https://www.notion.so/bditto/RESCU-20s-Data-Processing-Work-Plan-90c45f8549ed483092e0ee4732678e20) proposes:
- a view that automatically fills in gaps in the data (at both the 20-second and 15-minute levels)
- a view of inactive detectors to feed into maintenance
- a list of construction events impacting detectors
- a list of exclusions based on validation concerns (similar to #589)
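
As a starting point for the gap-filling view, the gaps first have to be located. The following is only a minimal Python sketch of that step, not the proposed view itself: `find_gaps`, its parameters, and the input shape (a list of bin start times for one detector) are hypothetical and are not taken from the `vds` schema. It reports each run of consecutive missing 15-minute bins as a half-open `(gap_start, gap_end)` interval.

```python
from datetime import datetime, timedelta

# Assumed bin width; pass step=timedelta(seconds=20) for the 20-second level.
BIN = timedelta(minutes=15)


def find_gaps(observed, start, end, step=BIN):
    """Return (gap_start, gap_end) pairs covering runs of missing bins in [start, end).

    observed: iterable of datetimes marking the start of each bin that has data.
    """
    present = set(observed)
    gaps = []
    gap_start = None
    t = start
    while t < end:
        if t not in present:
            # Open a new gap only if we are not already inside one.
            if gap_start is None:
                gap_start = t
        elif gap_start is not None:
            # Data resumes here, so the gap ends at this bin.
            gaps.append((gap_start, t))
            gap_start = None
        t += step
    if gap_start is not None:
        # The gap runs to the end of the window.
        gaps.append((gap_start, end))
    return gaps
```

Called with one day as the window (`start` at midnight, `end` 24 hours later), an empty result means all 96 bins are present. The same intervals could feed either a gap-filling step or the inactive-detector view.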
The idea from @scann0n's work in #605 is to derive thresholds from "good days" (days with all 96 15-minute bins present) and use them to identify good-volume days for each sensor. To automate this, we may want to use a rolling time horizon, since volumes are always changing, and to set thresholds per sensor rather than use a single value for each highway.
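
One possible shape for a rolling, per-sensor threshold is sketched below in Python. This is an illustration under stated assumptions, not the method used in #605: the function name, the row fields (`sensor_id`, `dt`, `bins`, `volume`), the 60-day window, the use of the median, and the 50% cut-off are all hypothetical choices. For each sensor-day it takes the median daily volume of that sensor's good days in the preceding window and flags the day if its volume falls below a fraction of that median.

```python
from datetime import date, timedelta
from statistics import median

BINS_PER_DAY = 96  # a "good day" has all 96 15-minute bins present


def low_volume_days(daily, window_days=60, fraction=0.5, min_good_days=5):
    """Flag (sensor_id, dt) pairs whose volume is low relative to recent good days.

    daily: list of dicts with keys sensor_id, dt (date), bins (int), volume (number).
    All parameter defaults are placeholder values, not calibrated thresholds.
    """
    by_sensor = {}
    for row in daily:
        by_sensor.setdefault(row["sensor_id"], []).append(row)

    flagged = set()
    for sensor_id, rows in by_sensor.items():
        for row in rows:
            window_start = row["dt"] - timedelta(days=window_days)
            # Volumes from this sensor's good days in the rolling window before dt.
            good = [
                r["volume"]
                for r in rows
                if window_start <= r["dt"] < row["dt"] and r["bins"] == BINS_PER_DAY
            ]
            if len(good) < min_good_days:
                continue  # not enough history to set a threshold for this day
            if row["volume"] < fraction * median(good):
                flagged.add((sensor_id, row["dt"]))
    return flagged
```

Because the baseline is computed per sensor, a low-volume ramp detector and a high-volume mainline detector are each compared against their own history rather than against one highway-wide value.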
From #617, consider including the network_outage and individual_rescu_outage tables.