
Add SOZStimulationDataset for Seizure Onset Zone Localization from SPES EEG

Open vineethagrl opened this issue 1 month ago • 0 comments

Contributors: Vineetha Gurrala (vgurr4), Clay Douglas (ckd6)

Type: Dataset Contribution

Description: This PR introduces SOZStimulationDataset, a PyTorch-compatible dataset class for single-pulse electrical stimulation (SPES) EEG recordings used for seizure onset zone (SOZ) localization, replicating the dataset structure used in our reproducibility project:

Reproducing: SAMIL: Spatial Attention-based Multi-modal Integration for Localizing Seizure Onset Zones from Single-pulse Electrical Stimulation

This dataset class does not include raw EEG data due to size/IRB constraints, but instead provides a standardized loading interface compatible with the preprocessed numpy outputs typically generated in SPES-based epilepsy research pipelines.

The expected input format is:

    root/
    ├── train_X_stim.npy   # [N_train, C, T]
    ├── train_y.npy        # [N_train]
    ├── val_X_stim.npy
    ├── val_y.npy
    ├── test_X_stim.npy
    └── test_y.npy
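For anyone without access to the preprocessed recordings, the layout above can be mocked with random arrays. The following is a minimal sketch, not part of the PR: the helper names `write_synthetic_splits` and `check_layout`, the array sizes, and the dtypes (`float32` signals, `int64` labels) are all assumptions made for illustration. Only the file names and the `[N, C, T]` / `[N]` shapes come from the description.

```python
import tempfile
from pathlib import Path

import numpy as np

SPLITS = ("train", "val", "test")


def write_synthetic_splits(root, n_per_split=4, n_channels=8, n_timepoints=64, seed=0):
    """Write random arrays using the file names and shapes from the expected layout.

    Hypothetical helper: sizes and dtypes are illustrative assumptions.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    rng = np.random.default_rng(seed)
    for split in SPLITS:
        # X has shape [N, C, T]; y has shape [N] with binary labels.
        x = rng.standard_normal((n_per_split, n_channels, n_timepoints)).astype(np.float32)
        y = rng.integers(0, 2, size=n_per_split).astype(np.int64)
        np.save(root / f"{split}_X_stim.npy", x)
        np.save(root / f"{split}_y.npy", y)
    return root


def check_layout(root):
    """Return {split: (X shape, y shape)} after basic consistency checks."""
    root = Path(root)
    shapes = {}
    for split in SPLITS:
        x = np.load(root / f"{split}_X_stim.npy")
        y = np.load(root / f"{split}_y.npy")
        if x.ndim != 3:
            raise ValueError(f"{split}: expected X of shape [N, C, T], got {x.shape}")
        if y.shape != (x.shape[0],):
            raise ValueError(f"{split}: expected y of shape [N], got {y.shape}")
        if not np.isin(y, (0, 1)).all():
            raise ValueError(f"{split}: labels must be 0 or 1")
        shapes[split] = (x.shape, y.shape)
    return shapes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(check_layout(write_synthetic_splits(tmp)))
```

A directory produced this way should be usable as `root` for quick smoke tests, assuming the dataset class only relies on the file names and shapes listed above.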

Each sample returns: {"X_stim": Tensor[C, T]}, label

Where label ∈ {0,1} corresponds to SOZ vs non-SOZ stimulation sites.
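The per-sample contract described above can be illustrated with a small map-style loader. This is a hedged sketch under stated assumptions, not the PR's `SOZStimulationDataset`: the class name `SketchSOZDataset` is hypothetical, and it returns NumPy arrays rather than tensors to stay dependency-light (a real implementation would presumably convert each sample to a tensor before returning it).

```python
import tempfile
from pathlib import Path

import numpy as np


class SketchSOZDataset:
    """Hypothetical stand-in for illustration; not the PR's implementation."""

    def __init__(self, root, split="train"):
        if split not in ("train", "val", "test"):
            raise ValueError(f"unknown split: {split!r}")
        root = Path(root)
        # File names follow the expected input format from the description.
        self.X = np.load(root / f"{split}_X_stim.npy")  # [N, C, T]
        self.y = np.load(root / f"{split}_y.npy")       # [N]
        if len(self.X) != len(self.y):
            raise ValueError("X and y must contain the same number of samples")

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        # Mirrors the described return value: ({"X_stim": [C, T]}, label).
        return {"X_stim": self.X[idx]}, int(self.y[idx])


if __name__ == "__main__":
    # Demo on throwaway synthetic data (sizes are arbitrary).
    with tempfile.TemporaryDirectory() as tmp:
        np.save(Path(tmp) / "train_X_stim.npy", np.zeros((2, 4, 16), dtype=np.float32))
        np.save(Path(tmp) / "train_y.npy", np.array([0, 1]))
        ds = SketchSOZDataset(tmp, split="train")
        x, y = ds[0]
        print(len(ds), x["X_stim"].shape, y)
```

Because the class defines `__len__` and `__getitem__`, it follows the map-style dataset protocol that PyTorch data loaders expect, which is presumably what "PyTorch-compatible" refers to here.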

The PR also includes:

- Example usage script in examples/soz_stimulation_example.py
- Minimal test case validating dataset load behavior
- Documentation describing expected data format and reference to the original study

Usage:

    from pyhealth.datasets import SOZStimulationDataset

    ds = SOZStimulationDataset(root="data/soz_spes_processed", split="train")
    x, y = ds[0]
    print(x["X_stim"].shape, y)

vineethagrl, Dec 07 '25 05:12