502 error when trying to download FDA UNII data
Problem Statement
We already have a view that uses DailyMed SPL data via RxNorm data files to go from NDC9 -> inactive ingredients (excipients): `int_mthspl_products_to_inactive_ingredients`. It is created via the `rxnorm` DAG as part of the dbt transform task.
One thing that could be improved is normalizing the UNII display name. The FDA publishes the list of all UNIIs and their preferred display names.
Criteria for Success
- [ ] Create new Airflow DAG to pull FDA UNII data: https://coderx.io/sagerx/source-data/source-data/fda-unique-ingredient-identifiers-uniis (the preferred file to download is the UNII DATA - not UNII List - https://precision.fda.gov/uniisearch/archive/latest/UNII_Data.zip)
- [ ] Any other general cleanup - I notice that this intermediate table is pulling directly from sagerx_lake, which is a dbt no-no
- [ ] Create a mart that pulls in `int_mthspl_products_to_inactive_ingredients` and then joins in the display name from the new FDA UNII DAG via UNII Code
- [ ] Push this mart to s3 on a monthly basis around the 10th of the month
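
As a starting point for discussion, here is a rough Python sketch of the download, parse, and join steps from the checklist above. It is not the actual DAG or dbt model. Several things are assumptions that would need checking against the real data: that the archive contains a tab-delimited `.txt` file with a header row, that it is UTF-8 encoded, that its columns are named `UNII` and `DISPLAY_NAME`, and that the intermediate table exposes a `unii_code` column. The function names and the `unii_display_name` output column are hypothetical.

```python
import csv
import io
import urllib.request
import zipfile

# URL taken from the checklist above.
UNII_DATA_URL = "https://precision.fda.gov/uniisearch/archive/latest/UNII_Data.zip"

# Standard cron expression for "00:00 on the 10th of every month";
# something like this could be used as the DAG schedule.
MONTHLY_ON_THE_10TH = "0 0 10 * *"


def download_unii_zip(url=UNII_DATA_URL, timeout=60):
    """Fetch the UNII archive and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def read_unii_display_names(zip_bytes, code_col="UNII", name_col="DISPLAY_NAME"):
    """Return {unii_code: display_name} from the first .txt file in the archive.

    Assumption: the file is tab-delimited, UTF-8, and has a header row.
    The column names are guesses and are therefore parameters.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        txt_members = [n for n in archive.namelist() if n.lower().endswith(".txt")]
        if not txt_members:
            raise ValueError("no .txt file found in the UNII archive")
        with archive.open(txt_members[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text, delimiter="\t")
            return {row[code_col].strip(): row[name_col].strip() for row in reader}


def add_display_names(rows, display_names, code_key="unii_code"):
    """Left join: keep every input row; codes without a match get None."""
    return [
        {**row, "unii_display_name": display_names.get(row[code_key])}
        for row in rows
    ]
```

In the real project the join would more likely live in a dbt mart as a SQL `left join` on the UNII code. The Python version is only meant to make the intended behavior concrete, in particular that rows without a matching UNII are kept rather than dropped.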
Additional Information
https://coderx.io/sagerx/source-data/source-data/fda-unique-ingredient-identifiers-uniis