sagerx

create airflow dag and dbt model for openFDA pregnancy category

Open saywurdson opened this issue 11 months ago • 0 comments

This PR integrates drug pregnancy category information from the OpenFDA API into SageRx.

Resolves #ISSUE NUMBER

Explanation

Airflow DAG for OpenFDA Pregnancy Categories:

* This DAG extracts data from the OpenFDA API's drug/label endpoint, specifically searching for records containing teratogenic_effects.
* It parses the pregnancy category (A, B, C, D, X) from the text.
* It formats associated NDCs to the 11-digit standard.
* The extracted data (NDC, RXCUI if available, Pregnancy Category) is saved to a JSON file.
* A subsequent task loads this data into the sagerx_lake.openfda_pregnancy_categories table.
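For reviewers, the parsing and NDC formatting steps could look roughly like the sketch below. It is not the code in this PR: the function names `parse_pregnancy_category` and `ndc_to_11_digits` are hypothetical, the regular expression assumes the label text uses the phrase "Pregnancy Category X", and the NDC helper assumes hyphenated 10-digit input (4-4-2, 5-3-2, or 5-4-1) that is zero-padded to 5-4-2.

```python
import re
from typing import Optional

# Assumption: the label text spells the category as "Pregnancy Category C".
# Only the phrase is matched case-insensitively; the letter must be uppercase.
_CATEGORY_RE = re.compile(r"(?i:pregnancy\s+category)\s*:?\s*([ABCDX])\b")


def parse_pregnancy_category(text: str) -> Optional[str]:
    """Return the first pregnancy category letter found in text, else None."""
    match = _CATEGORY_RE.search(text)
    return match.group(1) if match else None


def ndc_to_11_digits(ndc: str) -> Optional[str]:
    """Zero-pad a hyphenated NDC (labeler-product-package) to 11 digits (5-4-2).

    Returns None when the input is not three numeric segments of valid length.
    """
    parts = ndc.strip().split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return None
    labeler, product, package = parts
    if len(labeler) > 5 or len(product) > 4 or len(package) > 2:
        return None
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)
```

Returning `None` instead of raising lets the DAG skip labels with no recognizable category or a malformed NDC without failing the whole run; whether the actual DAG does this is an assumption.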

dbt Model:

* This model joins the openfda_pregnancy_categories data with the int_rxnorm_clinical_products_to_ndcs intermediate model to link pregnancy categories to clinical product RXCUIs via NDCs.
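The join can be illustrated with a small self-contained sketch. This is not the dbt model itself: it uses an in-memory SQLite database as a stand-in for the warehouse, the column names (`ndc`, `pregnancy_category`, `clinical_product_rxcui`) and the inner join are assumptions, and all rows are made-up placeholder values.

```python
import sqlite3


def link_categories_to_products(conn: sqlite3.Connection) -> list:
    """Join pregnancy categories to clinical product RXCUIs on 11-digit NDC."""
    query = """
        SELECT DISTINCT p.clinical_product_rxcui, c.pregnancy_category
        FROM openfda_pregnancy_categories AS c
        INNER JOIN int_rxnorm_clinical_products_to_ndcs AS p
            ON p.ndc = c.ndc
        ORDER BY p.clinical_product_rxcui, c.pregnancy_category
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schemas; the real tables may have different columns.
    CREATE TABLE openfda_pregnancy_categories (ndc TEXT, pregnancy_category TEXT);
    CREATE TABLE int_rxnorm_clinical_products_to_ndcs (ndc TEXT, clinical_product_rxcui TEXT);

    -- Placeholder rows, not real NDCs or RXCUIs.
    INSERT INTO openfda_pregnancy_categories VALUES
        ('00000000001', 'C'), ('00000000002', 'C'), ('00000000003', 'X');
    INSERT INTO int_rxnorm_clinical_products_to_ndcs VALUES
        ('00000000001', '111'), ('00000000002', '111'), ('00000000004', '222');
    """
)
rows = link_categories_to_products(conn)
```

In this sketch, two NDCs that map to the same clinical product collapse to one row because of `DISTINCT`, and NDCs without a match on either side are dropped by the inner join.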

Tests

  1. What testing did you do? Ran in a local version of SageRx. The table built in less than 2 minutes. (Screenshot 2025-04-25 at 3 48 49 PM)

saywurdson • Apr 25 '25 19:04