sagerx
Weekly data quality checks for mart updates
Depends on #262 and, to a lesser extent, #266.
Problem Statement
Without manual analysis, it's hard to know what changed week to week in our data marts.
Criteria for Success
NDC Description mart
- How many new NDCs?
- How many retired NDCs?
- How many NDC->RXCUI mappings changed from the previous week?
- How many new FDA descriptions?
- How many changed FDA descriptions?
- How many retired FDA descriptions?
- How many new RxNorm descriptions?
- How many changed RxNorm descriptions?
- How many retired RxNorm descriptions?
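One possible way to compute these counts is sketched below, assuming two weekly snapshots of the mart loaded as lists of dicts. The column names (`ndc`, `rxcui`, `fda_description`, `rxnorm_description`) are hypothetical and would need to match the real mart. It also assumes one interpretation of the criteria: for an NDC present in both weeks, a value is "new" if it was empty and is now populated, "retired" if it was populated and is now empty, and "changed" if both are populated but differ.

```python
from collections import Counter


def classify(old, new):
    """Classify how a single field moved between two weekly snapshots."""
    old, new = (old or "").strip(), (new or "").strip()
    if old == new:
        return "unchanged"
    if not old:
        return "new"
    if not new:
        return "retired"
    return "changed"


def ndc_weekly_summary(previous, current, key="ndc",
                       columns=("rxcui", "fda_description", "rxnorm_description")):
    """Count new/retired NDCs and per-column new/changed/retired values.

    Column names are assumptions, not the actual mart schema.
    """
    prev = {row[key]: row for row in previous}
    curr = {row[key]: row for row in current}
    shared = prev.keys() & curr.keys()

    summary = {
        "new_ndcs": len(curr.keys() - prev.keys()),
        "retired_ndcs": len(prev.keys() - curr.keys()),
    }
    for col in columns:
        # Only NDCs present in both weeks are compared column by column.
        counts = Counter(classify(prev[k].get(col), curr[k].get(col)) for k in shared)
        for status in ("new", "changed", "retired"):
            summary[f"{status}_{col}"] = counts[status]
    return summary
```

Here `changed_rxcui` would correspond to the NDC->RXCUI mapping count. Whether a description that appears only because its NDC is new should also count as a "new description" is a choice this sketch does not make; it counts those rows under `new_ndcs` only.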
ATC to RXCUI mart
- How many new RXCUIs?
- How many retired RXCUIs?
- How many RXCUI->ATC4 mappings changed from the previous week?
- Did any ATC1-4 codes change from the previous week? We would expect these to change only once a year.
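The ATC checks could be sketched in a similar way. This example assumes an RXCUI may map to more than one ATC4 code, so each RXCUI is compared as a set of codes. The column names (`rxcui`, `atc_1_code` through `atc_4_code`) are hypothetical placeholders for whatever the mart actually uses.

```python
from collections import defaultdict

# Hypothetical column names; adjust to the real mart schema.
ATC_COLUMNS = ("atc_1_code", "atc_2_code", "atc_3_code", "atc_4_code")


def atc_weekly_summary(previous, current):
    """Compare two weekly snapshots of an ATC-to-RXCUI mart."""

    def mappings(rows):
        # RXCUI -> set of ATC4 codes, since one RXCUI may have several.
        result = defaultdict(set)
        for row in rows:
            result[row["rxcui"]].add(row["atc_4_code"])
        return result

    def hierarchy(rows):
        # Distinct (ATC1, ATC2, ATC3, ATC4) combinations present in the mart.
        return {tuple(row[col] for col in ATC_COLUMNS) for row in rows}

    prev_map, curr_map = mappings(previous), mappings(current)
    shared = prev_map.keys() & curr_map.keys()
    return {
        "new_rxcuis": len(curr_map.keys() - prev_map.keys()),
        "retired_rxcuis": len(prev_map.keys() - curr_map.keys()),
        "changed_rxcui_atc4_mappings": sum(
            1 for rxcui in shared if prev_map[rxcui] != curr_map[rxcui]
        ),
        "atc_hierarchy_changed": hierarchy(previous) != hierarchy(current),
    }
```

Note that `atc_hierarchy_changed` is a coarse flag: because it only sees codes that appear in the mart, it can also flip when a code gains its first or loses its last mapped RXCUI, not only when the ATC classification itself is revised.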
Additional Information
We considered adding a task to the mart DAG that saves off the old version of the file, diffs it against the new version, and writes the results of that analysis to a table, or something along those lines.
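A minimal sketch of that save-and-diff idea, assuming the marts are written as CSV flatfiles. The function names, the archive naming scheme, and the key column are all hypothetical, and wiring these functions into tasks in the mart DAG is not shown.

```python
import csv
import shutil
from datetime import date
from pathlib import Path


def archive_mart(mart_path, archive_dir, run_date):
    """Copy the current mart file aside before the weekly run overwrites it."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{Path(mart_path).stem}_{run_date.isoformat()}.csv"
    shutil.copy2(mart_path, target)
    return target


def read_rows(path):
    """Load a CSV flatfile as a list of dicts keyed by header name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def diff_files(old_path, new_path, key):
    """Count rows added, removed, and changed between two mart files."""
    old = {row[key]: row for row in read_rows(old_path)}
    new = {row[key]: row for row in read_rows(new_path)}
    shared = old.keys() & new.keys()
    return {
        "added": len(new.keys() - old.keys()),
        "removed": len(old.keys() - new.keys()),
        "changed": sum(1 for k in shared if old[k] != new[k]),
    }
```

The returned dict could then be written as one row per run to a summary table, which would give a week-over-week history without anyone having to diff the flatfiles by hand. This sketch assumes the key column is unique per row; a mart with repeated keys would need a composite key or a set-based comparison instead.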
Link to historical mart flatfiles: https://drive.google.com/drive/folders/1IMotvyde-TptEOzRlCFouiWV00z4roRW?usp=sharing