
Add Wav2Sleep Multi-Modal Sleep Stage Classification Model

Open merebear9 opened this issue 1 month ago • 0 comments

Contributor: Meredith McClain (mmcclan2)
NetID: mmcclan2
Type: Model Implementation

Description

Implementation of wav2sleep, a unified multi-modal approach to sleep stage classification from physiological signals (ECG, PPG, abdominal and thoracic respiratory signals).

Paper

Title: wav2sleep: A Unified Multi-Modal Approach to Sleep Stage Classification from Physiological Signals
Authors: Jonathan F. Carter, Lionel Tarassenko
Link: https://arxiv.org/abs/2411.04644
Year: 2024

Key Features

  • ✅ Operates on variable sets of physiological signals
  • ✅ Handles heterogeneous datasets with different signal availability
  • ✅ Joint training across multiple modalities
  • ✅ Stochastic masking for robust learning
  • ✅ State-of-the-art sleep staging performance reported in the paper
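The variable-signal and stochastic-masking features above can be sketched as a small training-time helper. This is an illustrative assumption about how masking might work (`mask_modalities` is a hypothetical name, not the function in `wav2sleep.py`): randomly drop modalities each step, but always keep at least one so the model still receives input.

```python
import random

def mask_modalities(signals, p_drop=0.5, rng=random):
    """Randomly drop modalities during training (hypothetical helper;
    the issue's actual implementation may differ). Always keeps at
    least one modality so the model sees a non-empty input."""
    kept = {name: x for name, x in signals.items() if rng.random() > p_drop}
    if not kept:
        # Guarantee at least one modality survives the mask.
        name = rng.choice(list(signals))
        kept = {name: signals[name]}
    return kept

random.seed(0)
batch = {"ECG": "ecg_tensor", "PPG": "ppg_tensor", "THX": "thx_tensor"}
masked = mask_modalities(batch)
print(sorted(masked))  # a non-empty subset of the input modalities
```

Training on randomly masked subsets is what lets a single model handle heterogeneous datasets where only some signals are recorded.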

Architecture

  1. Signal Encoders: Separate CNN encoders for each modality (ECG, PPG, ABD, THX)
  2. Epoch Mixer: Transformer encoder for cross-modal fusion using CLS token
  3. Sequence Mixer: Dilated CNN for temporal modeling
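The three stages above can be sketched in PyTorch as follows. All names, dimensions, layer counts, and the number of output classes here are illustrative assumptions, not the actual `pyhealth/models/wav2sleep.py` code; the point is the data flow: per-modality CNN encoders produce one embedding per epoch, a transformer with a CLS token fuses whichever modalities are present, and a dilated CNN models the epoch sequence.

```python
import torch
import torch.nn as nn

class Wav2SleepSketch(nn.Module):
    """Minimal sketch of the three-stage design (hypothetical class,
    not the issue's implementation)."""

    def __init__(self, modalities=("ECG", "PPG", "ABD", "THX"),
                 d_model=64, num_classes=5):
        super().__init__()
        # 1. Signal encoders: one small CNN per modality.
        self.encoders = nn.ModuleDict({
            m: nn.Sequential(
                nn.Conv1d(1, d_model, kernel_size=7, stride=4, padding=3),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),
            ) for m in modalities
        })
        # 2. Epoch mixer: transformer fuses modality tokens via a CLS token.
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.epoch_mixer = nn.TransformerEncoder(layer, num_layers=2)
        # 3. Sequence mixer: dilated CNN over the sequence of epochs.
        self.sequence_mixer = nn.Sequential(
            nn.Conv1d(d_model, d_model, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(d_model, d_model, 3, padding=2, dilation=2), nn.ReLU(),
        )
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, signals):
        """signals: dict of modality -> (batch, epochs, samples);
        any subset of the configured modalities may be present."""
        tokens = []
        for name, x in signals.items():
            b, e, t = x.shape
            z = self.encoders[name](x.reshape(b * e, 1, t))   # (b*e, d, 1)
            tokens.append(z.squeeze(-1).reshape(b, e, -1))
        tok = torch.stack(tokens, dim=2)                      # (b, e, n_mod, d)
        b, e, n, d = tok.shape
        tok = tok.reshape(b * e, n, d)
        cls = self.cls.expand(b * e, -1, -1)
        fused = self.epoch_mixer(torch.cat([cls, tok], dim=1))[:, 0]  # CLS out
        seq = fused.reshape(b, e, d).transpose(1, 2)          # (b, d, epochs)
        seq = self.sequence_mixer(seq).transpose(1, 2)
        return self.head(seq)                                 # (b, epochs, classes)

model = Wav2SleepSketch()
out = model({"ECG": torch.randn(2, 10, 1024), "THX": torch.randn(2, 10, 1024)})
print(out.shape)  # torch.Size([2, 10, 5])
```

Because the epoch mixer treats modalities as a variable-length token set, the same weights accept any subset of signals at inference time.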

Files

  • pyhealth/models/wav2sleep.py - Complete model implementation (~600 lines)
  • examples/wav2sleep_example.py - Usage example with dummy data
  • examples/wav2sleep_README.md - Comprehensive documentation

Documentation

  • ✅ Google-style docstrings for all functions
  • ✅ Type hints throughout
  • ✅ Detailed examples in docstrings
  • ✅ Comprehensive README with:
    • Architecture overview
    • Usage examples
    • Expected performance benchmarks
    • Data format specifications

Test Cases

Run the test:

```shell
python pyhealth/models/wav2sleep.py
```

Expected output:

  • Model creation successful
  • Forward pass with multiple modalities
  • Forward pass with single modality
  • Probability predictions
  • "Example completed successfully!"
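The probability-prediction check above might look something like the following (a hypothetical mirror of the example script, not its actual code): per-epoch logits are converted to stage probabilities with a softmax, and each epoch's probabilities should sum to one.

```python
import torch

# Dummy logits standing in for the model's output:
# (batch, epochs, stages). Shapes are illustrative assumptions.
logits = torch.randn(2, 10, 5)
probs = torch.softmax(logits, dim=-1)

assert probs.shape == logits.shape
assert torch.allclose(probs.sum(dim=-1), torch.ones(2, 10))
print("Example completed successfully!")
```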

Validation

  • ✅ Tested with dummy data
  • ✅ All components (encoders, mixer, sequence) working
  • ✅ Variable modality input tested
  • ✅ Output shapes validated

Performance (from original paper)

| Dataset | Modality | Cohen's κ | Accuracy |
|---------|----------|-----------|----------|
| SHHS    | ECG only | 0.739     | 82.3%    |
| SHHS    | ECG+THX  | 0.779     | 85.0%    |
| MESA    | PPG only | 0.742     | –        |
| Census  | ECG only | 0.783     | 84.8%    |

Course Project

This contribution is part of CS 598 Deep Learning for Healthcare final project at UIUC (Fall 2025).

References

@article{carter2024wav2sleep,
  title={wav2sleep: A Unified Multi-Modal Approach to Sleep Stage 
         Classification from Physiological Signals},
  author={Carter, Jonathan F. and Tarassenko, Lionel},
  journal={arXiv preprint arXiv:2411.04644},
  year={2024}
}
