Automate data quality checks for VDS schema

Open gabrielwol opened this issue 1 year ago • 3 comments

Build on the work done in #605 and #626 on data quality checks for the old rescu schema, and automate those checks for the vds schema. The new schema contains more disaggregate RESCU data, which allows us to expand the checks to the individual lane level.

See also: #589

gabrielwol, Jul 28 '23 16:07

A brainstorm from 2020 (https://www.notion.so/bditto/RESCU-20s-Data-Processing-Work-Plan-90c45f8549ed483092e0ee4732678e20) proposes:

  • a view which automatically fills in gaps in the data (at both the 20sec and 15min levels)
  • a view of inactive detectors to feed into maintenance
  • a list of construction events impacting detectors
  • a list of exclusions based on validation concerns (similar to #589)
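
As a starting point for the first two items, here is a rough Python sketch of how gaps and inactive detectors could be identified at the 15min level. It only finds the gaps and does not fill them. The function names, the input shape (a dict of detector id to observed bin start times), and the definition of "inactive" as "no bins at all for the day" are assumptions for illustration, not anything already in the vds schema.

```python
from datetime import datetime, timedelta

BIN = timedelta(minutes=15)
BINS_PER_DAY = 96  # 24 hours * 4 bins per hour


def missing_bins(observed, day_start):
    """Return the 15-minute bin start times in [day_start, day_start + 1 day)
    that do not appear in `observed` (an iterable of datetimes)."""
    seen = set(observed)
    expected = (day_start + i * BIN for i in range(BINS_PER_DAY))
    return [ts for ts in expected if ts not in seen]


def inactive_detectors(observed_by_detector, day_start):
    """Return detector ids with no bins at all for the day.

    'Inactive' is a hypothetical definition here; a real check might
    require several consecutive empty days before flagging a detector.
    """
    return sorted(
        det
        for det, obs in observed_by_detector.items()
        if len(missing_bins(obs, day_start)) == BINS_PER_DAY
    )
```

The same idea would apply at the 20sec level by changing the bin width and the expected bin count; in practice this would more likely live in a SQL view than in Python.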

gabrielwol, Jul 28 '23 19:07

An idea from @scann0n's work in #605 is to derive volume thresholds from "good days" (days with all 96 15min bins present) and use them to determine good volume days for each sensor. To automate this, we may want to compute the thresholds over a rolling time horizon, since volumes are always changing, and to use individual sensor thresholds rather than a single value for each highway.
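
To make the rolling, per-sensor idea concrete, here is a minimal Python sketch. The input shape, the 60-day window, the minimum of 5 good days, and the "half of the median good-day volume" lower bound are all assumptions chosen for illustration; they are not the thresholds used in #605.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median

BINS_PER_DAY = 96  # a "good day" has all 96 15-minute bins present


def rolling_thresholds(daily, window_days=60, min_good_days=5, low_frac=0.5):
    """Compute a per-sensor lower volume threshold for each day.

    daily: iterable of (sensor_id, day, n_bins, volume) tuples.
    Returns {(sensor_id, day): threshold or None}. The threshold is
    low_frac * median volume of that sensor's good days in the
    window_days before `day` (the day itself is excluded). It is None
    when fewer than min_good_days good days are available.
    """
    rows = list(daily)

    # Collect complete days per sensor: sensor_id -> [(day, volume), ...]
    good = defaultdict(list)
    for sensor_id, day, n_bins, volume in rows:
        if n_bins == BINS_PER_DAY:
            good[sensor_id].append((day, volume))

    thresholds = {}
    for sensor_id, day, _, _ in rows:
        start = day - timedelta(days=window_days)
        history = [v for d, v in good[sensor_id] if start <= d < day]
        if len(history) >= min_good_days:
            thresholds[(sensor_id, day)] = low_frac * median(history)
        else:
            thresholds[(sensor_id, day)] = None
    return thresholds
```

A day's volume could then be compared against its own sensor's threshold, so a low-volume ramp detector is not judged against a value tuned for the whole highway.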

gabrielwol, Jul 28 '23 20:07

From #617, consider including the network_outage and individual_rescu_outage tables.

gabrielwol, Jul 28 '23 21:07