Weekly data quality checks for mart updates

Open jrlegrand opened this issue 1 year ago • 3 comments

Depends on #262 and to a lesser extent #266

Problem Statement

Without manual analysis, it's hard to know what changed week to week in our data marts.

Criteria for Success

NDC Description mart

  • How many new NDCs
  • How many retired NDCs
  • How many NDC->RXCUI mappings changed from previous week?
  • How many new FDA descriptions
  • How many changed FDA descriptions
  • How many retired FDA descriptions
  • How many new RxNorm descriptions
  • How many changed RxNorm descriptions
  • How many retired RxNorm descriptions

ATC to RXCUI mart

  • How many new RXCUIs
  • How many retired RXCUIs
  • How many RXCUI->ATC4 mappings changed from previous week?
  • Did anything change with ATC1-4 codes from previous week? Would expect this to only change once yearly

Additional Information

One option we considered: add a task to the mart DAG that saves off the previous version of the file, diffs it against the new one, and writes the results of this analysis to a table. Or something along those lines.
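As a rough illustration of the diff step, here is a minimal sketch that compares two weekly snapshots and counts new, retired, and changed entries. The column names (`ndc`, `rxcui`), the helper names, and the sample rows are all hypothetical and not taken from the actual mart files. It also assumes one row per key, which may not hold for the real marts.

```python
import csv
import io


def load_mapping(csv_file, key_col, value_col):
    """Read a mart flatfile into {key: value}.

    Assumes one row per key; if a key repeats, the last row wins.
    """
    reader = csv.DictReader(csv_file)
    return {row[key_col]: row[value_col] for row in reader}


def summarize_changes(previous, current):
    """Count keys that are new, retired, or mapped to a different value."""
    prev_keys, curr_keys = set(previous), set(current)
    shared = prev_keys & curr_keys
    return {
        "new": len(curr_keys - prev_keys),
        "retired": len(prev_keys - curr_keys),
        "changed": sum(1 for key in shared if previous[key] != current[key]),
    }


# Made-up snapshots standing in for last week's and this week's flatfiles.
last_week = io.StringIO("ndc,rxcui\n0001,111\n0002,222\n0003,333\n")
this_week = io.StringIO("ndc,rxcui\n0002,222\n0003,999\n0004,444\n")

summary = summarize_changes(
    load_mapping(last_week, "ndc", "rxcui"),
    load_mapping(this_week, "ndc", "rxcui"),
)
print(summary)  # {'new': 1, 'retired': 1, 'changed': 1}
```

The same two functions could be pointed at other key/value pairs (for example RXCUI to ATC4, or NDC to a description column), and the resulting counts written as one row per week to a tracking table.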

Link to historical mart flatfiles: https://drive.google.com/drive/folders/1IMotvyde-TptEOzRlCFouiWV00z4roRW?usp=sharing

jrlegrand, Mar 09 '24 19:03

@leemlb06pmi / @Komal77rao - I can get you the past couple of weeks of data mart flatfiles if that would be helpful for this issue - let me know when you're ready.

jrlegrand, Mar 20 '24 13:03

ok that's great - baseline files will definitely be useful when we get to this point. thanks!

leemlb06pmi, Mar 20 '24 14:03

https://drive.google.com/drive/folders/1IMotvyde-TptEOzRlCFouiWV00z4roRW?usp=sharing

jrlegrand, Mar 20 '24 15:03