Add Wav2Sleep Multi-Modal Sleep Stage Classification Model
Contributor: Meredith McClain (mmcclan2)
NetID: mmcclan2
Type: Model Implementation
Description
Implementation of wav2sleep, a unified multi-modal approach to sleep stage classification from physiological signals (ECG, PPG, and abdominal/thoracic respiratory signals).
Paper
Title: wav2sleep: A Unified Multi-Modal Approach to Sleep Stage Classification from Physiological Signals
Authors: Jonathan F. Carter, Lionel Tarassenko
Link: https://arxiv.org/abs/2411.04644
Year: 2024
Key Features
- ✅ Operates on variable sets of physiological signals
- ✅ Handles heterogeneous datasets with different signal availability
- ✅ Joint training across multiple modalities
- ✅ Stochastic masking for robust learning (see the sketch after this list)
- ✅ State-of-the-art sleep stage classification performance, as reported in the original paper
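A minimal sketch of the stochastic-masking idea, assuming the model takes a dict mapping modality names to tensors; the helper name and keep probability below are illustrative assumptions, and the real masking logic lives in pyhealth/models/wav2sleep.py:

```python
import random

def mask_modalities(signals: dict, keep_prob: float = 0.5) -> dict:
    """Randomly drop modalities during training so the model learns to
    predict sleep stages from any subset of the available signals."""
    kept = [name for name in signals if random.random() < keep_prob]
    if not kept:  # always keep at least one modality
        kept = [random.choice(list(signals))]
    return {name: signals[name] for name in kept}
```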
Architecture
- Signal Encoders: Separate CNN encoders for each modality (ECG, PPG, ABD, THX)
- Epoch Mixer: Transformer encoder for cross-modal fusion using CLS token
- Sequence Mixer: Dilated CNN for temporal modeling (sketched below)
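A minimal PyTorch sketch of this three-stage layout; the module names, channel sizes, and layer counts are assumptions for illustration, not the classes shipped in pyhealth/models/wav2sleep.py:

```python
import torch
import torch.nn as nn

class EpochEncoder(nn.Module):
    """Per-modality 1D CNN mapping one 30-second epoch of signal to an embedding."""
    def __init__(self, d_model: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Linear(64, d_model)

    def forward(self, x):  # x: (batch * epochs, 1, samples_per_epoch)
        return self.proj(self.net(x).squeeze(-1))

class Wav2SleepSketch(nn.Module):
    def __init__(self, modalities=("ECG", "PPG", "ABD", "THX"),
                 d_model: int = 128, n_classes: int = 4):
        super().__init__()
        self.encoders = nn.ModuleDict({m: EpochEncoder(d_model) for m in modalities})
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))  # CLS token for fusion
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.epoch_mixer = nn.TransformerEncoder(layer, num_layers=2)
        # Dilated CNN over the epoch sequence for temporal context.
        self.sequence_mixer = nn.Sequential(
            nn.Conv1d(d_model, d_model, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(d_model, d_model, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv1d(d_model, n_classes, 1),
        )

    def forward(self, signals: dict):
        # signals[name]: (batch, epochs, samples_per_epoch); any subset of modalities.
        tokens = []
        for name, x in signals.items():
            b, e, s = x.shape
            tokens.append(self.encoders[name](x.reshape(b * e, 1, s)).reshape(b, e, -1))
        tokens = torch.stack(tokens, dim=2)             # (batch, epochs, n_present, d_model)
        b, e, m, d = tokens.shape
        tokens = tokens.reshape(b * e, m, d)
        cls = self.cls.expand(b * e, 1, d)
        fused = self.epoch_mixer(torch.cat([cls, tokens], dim=1))[:, 0]  # take CLS output
        fused = fused.reshape(b, e, d).transpose(1, 2)  # (batch, d_model, epochs)
        return self.sequence_mixer(fused).transpose(1, 2)  # (batch, epochs, n_classes)
```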
Files
- pyhealth/models/wav2sleep.py - Complete model implementation (~600 lines)
- examples/wav2sleep_example.py - Usage example with dummy data
- examples/wav2sleep_README.md - Comprehensive documentation
Documentation
- ✅ Google-style docstrings for all functions
- ✅ Type hints throughout
- ✅ Detailed examples in docstrings
- ✅ Comprehensive README with:
  - Architecture overview
  - Usage examples
  - Expected performance benchmarks
  - Data format specifications
Test Cases
Run the test:

```bash
python pyhealth/models/wav2sleep.py
```
Expected output:
- Model creation successful
- Forward pass with multiple modalities
- Forward pass with single modality
- Probability predictions
- "Example completed successfully!"
Validation
- ✅ Tested with dummy data
- ✅ All components (signal encoders, epoch mixer, sequence mixer) working
- ✅ Variable modality input tested
- ✅ Output shapes validated
Performance (from original paper)
| Dataset | Modality | Cohen's κ | Accuracy |
|---|---|---|---|
| SHHS | ECG only | 0.739 | 82.3% |
| SHHS | ECG+THX | 0.779 | 85.0% |
| MESA | PPG only | 0.742 | - |
| Census | ECG only | 0.783 | 84.8% |
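For reference, Cohen's κ measures agreement between predicted and reference sleep stages corrected for chance agreement; a minimal sketch of computing it with scikit-learn (illustrative only, not a dependency added by this PR):

```python
from sklearn.metrics import cohen_kappa_score, accuracy_score

y_true = [0, 1, 1, 2, 3, 1, 0, 2]  # reference stages (e.g. Wake/Light/Deep/REM)
y_pred = [0, 1, 2, 2, 3, 1, 0, 1]  # model predictions
print("kappa:", cohen_kappa_score(y_true, y_pred))
print("accuracy:", accuracy_score(y_true, y_pred))
```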
Course Project
This contribution is part of the CS 598 Deep Learning for Healthcare final project at UIUC (Fall 2025).
References
```bibtex
@article{carter2024wav2sleep,
  title={wav2sleep: A Unified Multi-Modal Approach to Sleep Stage Classification from Physiological Signals},
  author={Carter, Jonathan F. and Tarassenko, Lionel},
  journal={arXiv preprint arXiv:2411.04644},
  year={2024}
}
```