PyHealth
add-tcga-paad-dataset-sadiq5
Add TCGA-PAAD Dataset (Pancreatic Adenocarcinoma)
- Introduces TCGAPAADDataset for the TCGA-PAAD cohort, with standardized mutations and clinical tables.
- Config: tcga_paad.yaml defines two tables: mutations (hugo_symbol, variant_classification, variant_type, hgvsc, hgvsp, tumor_sample_barcode) and clinical (age_at_diagnosis, vital_status, days_to_death, tumor_stage).
- Usage (set_task() with no arguments uses the default CancerSurvivalPrediction() task):

      from pyhealth.datasets import TCGAPAADDataset
      dataset = TCGAPAADDataset(root="path/to/TCGA-PAAD")
      samples = dataset.set_task()
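The column lists in the config description can be checked against raw files before loading. The following is a minimal sketch, assuming the tables are CSV files with a header row whose names match the config exactly; `missing_columns` and `EXPECTED_COLUMNS` are hypothetical names used for illustration and are not part of PyHealth.

```python
import csv
import io

# Column names as listed in the tcga_paad.yaml description above.
EXPECTED_COLUMNS = {
    "mutations": [
        "hugo_symbol",
        "variant_classification",
        "variant_type",
        "hgvsc",
        "hgvsp",
        "tumor_sample_barcode",
    ],
    "clinical": [
        "age_at_diagnosis",
        "vital_status",
        "days_to_death",
        "tumor_stage",
    ],
}


def missing_columns(table, csv_file):
    """Return the expected columns of `table` that are absent from the CSV header.

    Hypothetical helper: `csv_file` is any open text file object.
    """
    # Read only the first row; an empty file yields an empty header.
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [col for col in EXPECTED_COLUMNS[table] if col not in present]


# Example: an in-memory clinical table that lacks the tumor_stage column.
sample = io.StringIO("age_at_diagnosis,vital_status,days_to_death\n65,Dead,400\n")
result = missing_columns("clinical", sample)
print(result)
```

For files on disk, the same helper could be called with `open(path, newline="")`. An empty result means every column named in the config is present in the header.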
Testing
python -m pytest tests/core/test_tcga_paad.py -q
Output:
....... [100%]
7 passed in 5.29s