PTBXL ECG Dataset Contribution
Samir Gray (sbgray2)
Paper Title: Data Augmentation for Electrocardiograms
Link to Paper: https://arxiv.org/abs/2204.04360
Link to Dataset: https://physionet.org/content/ptb-xl/1.0.1/
Type of contribution:
Dataset: implementation of the PTB-XL ECG dataset as a PyHealth-compatible loader
Task: implementation of a binary ECG classification task as a PyHealth-compatible task
High-level description: This PR introduces a new dataset class, PTBXL, for the publicly available PTB-XL 12-lead electrocardiogram dataset. It uses PyHealth's YAML configuration system to manage metadata and supports loading waveform segments via wfdb. Signals are downsampled or padded to a fixed length, and labels are derived from SCP diagnostic codes using a customizable binary classifier (default: NORM vs abnormal). The PR also implements a corresponding task class for binary classification based on SCP diagnostic codes (NORM vs abnormal).
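For reviewers, the preprocessing described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function names `scp_to_binary_label` and `to_fixed_length` are hypothetical, the 0 = NORM / 1 = abnormal convention is an assumption, and the downsampling shown is naive index selection without an anti-aliasing filter. It assumes the `scp_codes` column of ptbxl_database.csv holds a dict literal stored as a string.

```python
import ast

import numpy as np


def scp_to_binary_label(scp_codes: str) -> int:
    """Map an scp_codes string to a binary label.

    Assumed convention (not confirmed by the PR): 0 = NORM, 1 = abnormal.
    """
    # scp_codes looks like "{'NORM': 100.0, 'SR': 0.0}" in the CSV
    codes = ast.literal_eval(scp_codes)
    return 0 if "NORM" in codes else 1


def to_fixed_length(signal: np.ndarray, target_len: int) -> np.ndarray:
    """Bring a (n_samples, n_leads) signal to target_len samples.

    Longer signals are downsampled by picking evenly spaced samples
    (a naive stand-in for proper resampling); shorter ones are zero-padded.
    """
    n_samples = signal.shape[0]
    if n_samples >= target_len:
        idx = np.linspace(0, n_samples - 1, target_len).round().astype(int)
        return signal[idx]
    pad = np.zeros((target_len - n_samples, signal.shape[1]), dtype=signal.dtype)
    return np.concatenate([signal, pad], axis=0)
```

A custom labeling rule (for example, one that thresholds the likelihood stored with each SCP code) could be swapped in for `scp_to_binary_label` without touching the length handling.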
Files to review/test:
pyhealth/datasets/ptbxl.py — defines the PTBXL dataset class
pyhealth/datasets/configs/ptbxl.yaml — YAML config for loading ptbxl_database.csv
pyhealth/unittests/test_datasets/test_ptbxl.py — unit test for basic dataset loading and sampling
pyhealth/datasets/__init__.py — includes ptbxl in __init__.py so that the unit test case is useable
pyhealth/unittests/test_binary_ECG_classification.py — unit test for preprocessing and __call__
pyhealth/tasks/binary_ECG_classification.py — task for binary classification on ECG signals
pyhealth/tasks/__init__.py — includes binary_ECG_classification in __init__.py so that the unit test case is useable
I would! Where can I find the requirements for that? The links in the Google Doc are all broken.
Oh, thanks for the heads-up! Let me see if I can't fix those. Check whenever you have time! I believe I've updated most of the links, especially the ones regarding the tasks.
Hi John,
I've updated my PR to include the task and the associated unit test! Sorry for the delay. Please let me know if you need any changes made.
Closing this PR as it lacks proper labeling. Please add appropriate labels and reopen if needed.