Add OhioT1DM Dataset for Blood Glucose Level Prediction
Contributor Info
- Name: Chengcen Zhou
- Email: [email protected]
Type of Contribution
- New dataset (OhioT1DM: Ohio Type 1 Diabetes Mellitus)
- New tasks (blood glucose prediction, hypoglycemia detection, hyperglycemia detection, glucose range classification)
- New unit tests
Relationship to Previous PR
This PR builds on my earlier contribution in PR #682 (WESAD Dataset for Wearable Stress Detection). I use both datasets, WESAD and OhioT1DM, in my DLH final project, which replicates "Simulation of Health Time Series with Nonstationarity" (Toye, Gomez, & Kleinberg, 2024).
| PR | Dataset | Task | Domain |
|---|---|---|---|
| #682 | WESAD | Stress Detection | Mental Health / Wearables |
| This PR | OhioT1DM | Blood Glucose Prediction | Diabetes Management |
What Is the OhioT1DM Dataset
This PR integrates the OhioT1DM dataset [Marling & Bunescu, 2020] into PyHealth for blood glucose level prediction research. It adds:
- An `OhioT1DMDataset` class for loading continuous glucose monitoring (CGM) data from 12 subjects with Type 1 Diabetes
- Task functions for blood glucose prediction (30-minute and 60-minute horizons), hypoglycemia detection, hyperglycemia detection, and glucose range classification
- Unit tests with synthetic XML data generation
The implementation follows the contributing guidelines (PEP8, Google-style docstrings, and documented function signatures).
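As a rough illustration of what the detection and range classification tasks compute, the sketch below labels individual glucose readings. The 70 and 180 mg/dL cutoffs are an assumption based on common clinical convention, and the function names are hypothetical; the thresholds and label encoding actually used in `blood_glucose_prediction_ohiot1dm.py` may differ.

```python
# Assumed cutoffs in mg/dL (common clinical convention, not confirmed by this PR).
HYPO_THRESHOLD = 70
HYPER_THRESHOLD = 180


def glucose_range_label(glucose_mg_dl: float) -> int:
    """Return 0 for low, 1 for in-range, 2 for high (hypothetical encoding)."""
    if glucose_mg_dl < HYPO_THRESHOLD:
        return 0
    if glucose_mg_dl > HYPER_THRESHOLD:
        return 2
    return 1


def is_hypoglycemic(glucose_mg_dl: float) -> int:
    """Binary label for hypoglycemia detection."""
    return int(glucose_mg_dl < HYPO_THRESHOLD)


def is_hyperglycemic(glucose_mg_dl: float) -> int:
    """Binary label for hyperglycemia detection."""
    return int(glucose_mg_dl > HYPER_THRESHOLD)


# Made-up readings that cover both boundaries.
readings = [55.0, 70.0, 120.0, 180.0, 250.0]
range_labels = [glucose_range_label(g) for g in readings]
hypo_labels = [is_hypoglycemic(g) for g in readings]
hyper_labels = [is_hyperglycemic(g) for g in readings]
```

In this sketch the boundary values 70 and 180 count as in range; whether the PR treats them as inclusive is not stated.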
Files to Review
Modified:
- `pyhealth/datasets/__init__.py`: register `OhioT1DMDataset`
- `pyhealth/tasks/__init__.py`: register `blood_glucose_prediction_ohiot1dm_fn`
New:
- `pyhealth/datasets/ohiot1dm.py`: `OhioT1DMDataset` class implementation
- `pyhealth/tasks/blood_glucose_prediction_ohiot1dm.py`: Task functions for glucose prediction
- `tests/test_ohiot1dm.py`: Unit tests for dataset and tasks
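The unit tests rely on synthetic XML rather than the real (access-restricted) data. A minimal sketch of that idea follows, using only the standard library. The element and attribute names (`patient`, `glucose_level`, `event`, `ts`, `value`) and the timestamp layout are assumptions about the OhioT1DM file format, and both helper functions are hypothetical; they are not taken from `tests/test_ohiot1dm.py`.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Assumed timestamp layout (day-month-year); adjust if the real files differ.
TS_FORMAT = "%d-%m-%Y %H:%M:%S"


def make_synthetic_patient_xml(patient_id: str, values: list) -> str:
    """Build a tiny patient document with one CGM event every 5 minutes."""
    root = ET.Element("patient", {"id": patient_id})
    glucose = ET.SubElement(root, "glucose_level")
    start = datetime(2020, 1, 1, 0, 0, 0)
    for i, value in enumerate(values):
        ts = start + timedelta(minutes=5 * i)
        ET.SubElement(
            glucose, "event", {"ts": ts.strftime(TS_FORMAT), "value": str(value)}
        )
    return ET.tostring(root, encoding="unicode")


def parse_glucose_events(xml_text: str) -> list:
    """Return a list of (timestamp, glucose value) tuples from the XML text."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.findall("./glucose_level/event"):
        ts = datetime.strptime(event.get("ts"), TS_FORMAT)
        events.append((ts, float(event.get("value"))))
    return events


xml_text = make_synthetic_patient_xml("synthetic-001", [110, 115, 121])
events = parse_glucose_events(xml_text)
```

Generating the fixture in code keeps the tests runnable without distributing any patient data.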
Dataset Info
| Item | Details |
|---|---|
| Subjects | 12 (2018 cohort: 6, 2020 cohort: 6) |
| Duration | 8 weeks per subject |
| CGM Readings | Every 5 minutes |
| Data Includes | Glucose, insulin (basal/bolus), meals, exercise, sleep, physiological sensors |
| Source | UCI / Ohio University |
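The 5-minute sampling interval determines how a prediction horizon maps onto reading indices: 30 minutes is 6 steps ahead and 60 minutes is 12. The sketch below shows that arithmetic and one way to cut a glucose series into (history, target) pairs. The function names and the 12-reading history length are illustrative assumptions, not the windowing used by this PR.

```python
from typing import List, Tuple

SAMPLING_INTERVAL_MIN = 5  # CGM interval from the table above
READINGS_PER_DAY = 24 * 60 // SAMPLING_INTERVAL_MIN  # 288
# Nominal upper bound for 8 weeks of data, ignoring sensor gaps.
MAX_READINGS_PER_SUBJECT = 8 * 7 * READINGS_PER_DAY  # 16128


def horizon_to_steps(horizon_min: int) -> int:
    """Convert a horizon in minutes to a number of CGM readings."""
    if horizon_min % SAMPLING_INTERVAL_MIN != 0:
        raise ValueError("horizon must be a multiple of the sampling interval")
    return horizon_min // SAMPLING_INTERVAL_MIN


def make_windows(
    series: List[float], history_steps: int, horizon_min: int
) -> List[Tuple[List[float], float]]:
    """Pair each history window with the reading `horizon_min` after its end."""
    steps = horizon_to_steps(horizon_min)
    samples = []
    last_start = len(series) - history_steps - steps
    for start in range(last_start + 1):
        history = series[start:start + history_steps]
        # Target sits `steps` readings after the last reading in the history.
        target = series[start + history_steps - 1 + steps]
        samples.append((history, target))
    return samples


# 20 made-up readings, 60 minutes of history, 30-minute horizon.
series = [float(v) for v in range(100, 120)]
samples = make_windows(series, history_steps=12, horizon_min=30)
```

This sketch assumes a gap-free series; real CGM data has missing readings, so a production version would need to check timestamps before forming a window.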
Reference
- Paper Title: Marling, C., & Bunescu, R. "The OhioT1DM Dataset for Blood Glucose Level Prediction: Update 2020", CEUR Workshop Proceedings, 2020
- Paper Link: https://pmc.ncbi.nlm.nih.gov/articles/PMC7881904/
- Dataset Link: https://www.kaggle.com/datasets/ryanmouton/ohiot1dm
How to Use
```python
from pyhealth.datasets import OhioT1DMDataset
from pyhealth.tasks import blood_glucose_prediction_ohiot1dm_fn

# Load dataset
dataset = OhioT1DMDataset(root="/path/to/OhioT1DM/")

# Apply task (30-minute prediction horizon)
dataset = dataset.set_task(blood_glucose_prediction_ohiot1dm_fn)

# Access samples
print(f"Total samples: {len(dataset.samples)}")
```