PyHealth
Add MIMIC CXR Reports Dataset Class
This contribution:
- Adds a new dataset: MIMIC-CXR Database 2.1.0
- Implements a new dataset class compliant with PyHealth’s BaseDataset
- Adds a Pydantic-validated YAML config
- Automatically extracts PATIENTID and STUDYID along with the FINDINGS and IMPRESSION sections of each report
- Adds no breaking changes to PyHealth’s existing datasets
- Keeps data access external (MIMIC files must be obtained from PhysioNet through Credentialed Access)
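To illustrate the extraction step listed above, here is a minimal sketch of how the FINDINGS and IMPRESSION sections and the patient/study IDs might be pulled from a single report. It is not the code in this PR: the helper names `extract_sections` and `parse_ids`, the `p<patient>/s<study>.txt` path layout, and the sample report text are all assumptions made for illustration.

```python
import re

# Hypothetical helpers for illustration only; not the PR's implementation.
# Matches a "FINDINGS:" or "IMPRESSION:" header at the start of a line.
SECTION_RE = re.compile(r"^[ \t]*(FINDINGS|IMPRESSION)[ \t]*:", re.MULTILINE)

# Assumed path layout: .../p<patient_id>/s<study_id>.txt
ID_RE = re.compile(r"p(\d+)/s(\d+)\.txt$")


def extract_sections(report_text: str) -> dict:
    """Return whitespace-normalized FINDINGS and IMPRESSION bodies."""
    sections = {"findings": "", "impression": ""}
    matches = list(SECTION_RE.finditer(report_text))
    for i, match in enumerate(matches):
        # A section runs until the next recognized header or end of text.
        # Headers other than these two are not detected, so their text
        # would be folded into the preceding section.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report_text)
        body = report_text[match.end():end]
        sections[match.group(1).lower()] = " ".join(body.split())
    return sections


def parse_ids(path: str) -> tuple:
    """Return (patient_id, study_id) parsed from a report file path."""
    match = ID_RE.search(path)
    if match is None:
        raise ValueError(f"unrecognized report path: {path}")
    return match.group(1), match.group(2)


# Made-up report text, used only to exercise the helpers.
sample = """ EXAMINATION:  CHEST (PA AND LAT)

 FINDINGS:
 The lungs are clear.  No pleural effusion
 or pneumothorax.

 IMPRESSION:
 No acute cardiopulmonary process.
"""

sections = extract_sections(sample)
patient_id, study_id = parse_ids("files/p10/p10000032/s50414267.txt")
```

Reports that lack one of the two headers simply yield an empty string for that key, which keeps downstream code free of missing-key checks.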
This PR is submitted by the following group of UIUC students:
- Lokanath Das (ldas2)
- Jared Backofen (jaredb3)
- Jacob Ray Fuehne (jfuehne2)
The following files are introduced as part of this PR:
pyhealth/
├── datasets/
│   ├── mimic_cxr_reports.py          # Dataset implementation
│   ├── configs/
│   │   └── mimic_cxr_reports.yaml    # Dataset configuration (Pydantic validated)
│   └── __init__.py                   # Updated the Dataset class relative import
├── tests/
│   └── test_mimic_cxr_reports.py     # Test script for dataset loader
└── docs/
    └── README_mimic_cxr_reports.md   # Documentation