
Add Wav2Sleep Multi-Modal Sleep Stage Classification Model

Open merebear9 opened this issue 1 month ago • 0 comments

Contributor: Meredith McClain (mmcclan2)
NetID: mmcclan2
Type: Model Implementation

Description

Implementation of wav2sleep, a unified multi-modal approach to sleep stage classification from physiological signals (ECG, PPG, abdominal and thoracic respiratory signals).

Paper

Title: wav2sleep: A Unified Multi-Modal Approach to Sleep Stage Classification from Physiological Signals
Authors: Jonathan F. Carter, Lionel Tarassenko
Link: https://arxiv.org/abs/2411.04644
Year: 2024

Key Features

  • ✅ Operates on variable sets of physiological signals
  • ✅ Handles heterogeneous datasets with different signal availability
  • ✅ Joint training across multiple modalities
  • ✅ Stochastic masking for robust learning
  • ✅ State-of-the-art sleep staging performance reported in the paper
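The variable-signal and stochastic-masking features above can be sketched as a small training-time helper. This is an illustrative assumption about how masking might work (`mask_modalities` is a hypothetical name, not the function in `wav2sleep.py`): randomly drop modalities each step, but always keep at least one so the model still receives input.

```python
import random

def mask_modalities(signals, p_drop=0.5, rng=random):
    """Randomly drop modalities during training (hypothetical helper;
    the issue's actual implementation may differ). Always keeps at
    least one modality so the model sees a non-empty input."""
    kept = {name: x for name, x in signals.items() if rng.random() > p_drop}
    if not kept:
        # Guarantee at least one modality survives the mask.
        name = rng.choice(list(signals))
        kept = {name: signals[name]}
    return kept

random.seed(0)
batch = {"ECG": "ecg_tensor", "PPG": "ppg_tensor", "THX": "thx_tensor"}
masked = mask_modalities(batch)
print(sorted(masked))  # a non-empty subset of the input modalities
```

Training on randomly masked subsets is what lets a single model handle heterogeneous datasets where only some signals are recorded.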

Architecture

  1. Signal Encoders: Separate CNN encoders for each modality (ECG, PPG, ABD, THX)
  2. Epoch Mixer: Transformer encoder for cross-modal fusion using CLS token
  3. Sequence Mixer: Dilated CNN for temporal modeling
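The three stages above can be sketched in PyTorch as follows. All names, dimensions, layer counts, and the number of output classes here are illustrative assumptions, not the actual `pyhealth/models/wav2sleep.py` code; the point is the data flow: per-modality CNN encoders produce one embedding per epoch, a transformer with a CLS token fuses whichever modalities are present, and a dilated CNN models the epoch sequence.

```python
import torch
import torch.nn as nn

class Wav2SleepSketch(nn.Module):
    """Minimal sketch of the three-stage design (hypothetical class,
    not the issue's implementation)."""

    def __init__(self, modalities=("ECG", "PPG", "ABD", "THX"),
                 d_model=64, num_classes=5):
        super().__init__()
        # 1. Signal encoders: one small CNN per modality.
        self.encoders = nn.ModuleDict({
            m: nn.Sequential(
                nn.Conv1d(1, d_model, kernel_size=7, stride=4, padding=3),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),
            ) for m in modalities
        })
        # 2. Epoch mixer: transformer fuses modality tokens via a CLS token.
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.epoch_mixer = nn.TransformerEncoder(layer, num_layers=2)
        # 3. Sequence mixer: dilated CNN over the sequence of epochs.
        self.sequence_mixer = nn.Sequential(
            nn.Conv1d(d_model, d_model, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(d_model, d_model, 3, padding=2, dilation=2), nn.ReLU(),
        )
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, signals):
        """signals: dict of modality -> (batch, epochs, samples);
        any subset of the configured modalities may be present."""
        tokens = []
        for name, x in signals.items():
            b, e, t = x.shape
            z = self.encoders[name](x.reshape(b * e, 1, t))   # (b*e, d, 1)
            tokens.append(z.squeeze(-1).reshape(b, e, -1))
        tok = torch.stack(tokens, dim=2)                      # (b, e, n_mod, d)
        b, e, n, d = tok.shape
        tok = tok.reshape(b * e, n, d)
        cls = self.cls.expand(b * e, -1, -1)
        fused = self.epoch_mixer(torch.cat([cls, tok], dim=1))[:, 0]  # CLS out
        seq = fused.reshape(b, e, d).transpose(1, 2)          # (b, d, epochs)
        seq = self.sequence_mixer(seq).transpose(1, 2)
        return self.head(seq)                                 # (b, epochs, classes)

model = Wav2SleepSketch()
out = model({"ECG": torch.randn(2, 10, 1024), "THX": torch.randn(2, 10, 1024)})
print(out.shape)  # torch.Size([2, 10, 5])
```

Because the epoch mixer treats modalities as a variable-length token set, the same weights accept any subset of signals at inference time.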

Files

  • pyhealth/models/wav2sleep.py - Complete model implementation (~600 lines)
  • examples/wav2sleep_example.py - Usage example with dummy data
  • examples/wav2sleep_README.md - Comprehensive documentation

Documentation

  • ✅ Google-style docstrings for all functions
  • ✅ Type hints throughout
  • ✅ Detailed examples in docstrings
  • ✅ Comprehensive README with:
    • Architecture overview
    • Usage examples
    • Expected performance benchmarks
    • Data format specifications

Test Cases

Run the test:

```shell
python pyhealth/models/wav2sleep.py
```

Expected output:

  • Model creation successful
  • Forward pass with multiple modalities
  • Forward pass with single modality
  • Probability predictions
  • "Example completed successfully!"
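The probability-prediction check above might look something like the following (a hypothetical mirror of the example script, not its actual code): per-epoch logits are converted to stage probabilities with a softmax, and each epoch's probabilities should sum to one.

```python
import torch

# Dummy logits standing in for the model's output:
# (batch, epochs, stages). Shapes are illustrative assumptions.
logits = torch.randn(2, 10, 5)
probs = torch.softmax(logits, dim=-1)

assert probs.shape == logits.shape
assert torch.allclose(probs.sum(dim=-1), torch.ones(2, 10))
print("Example completed successfully!")
```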

Validation

  • ✅ Tested with dummy data
  • ✅ All components (encoders, mixer, sequence) working
  • ✅ Variable modality input tested
  • ✅ Output shapes validated

Performance (from original paper)

| Dataset | Modality | Cohen's κ | Accuracy |
|---------|----------|-----------|----------|
| SHHS    | ECG only | 0.739     | 82.3%    |
| SHHS    | ECG+THX  | 0.779     | 85.0%    |
| MESA    | PPG only | 0.742     | –        |
| Census  | ECG only | 0.783     | 84.8%    |

Course Project

This contribution is part of CS 598 Deep Learning for Healthcare final project at UIUC (Fall 2025).

References

@article{carter2024wav2sleep,
  title={wav2sleep: A Unified Multi-Modal Approach to Sleep Stage 
         Classification from Physiological Signals},
  author={Carter, Jonathan F. and Tarassenko, Lionel},
  journal={arXiv preprint arXiv:2411.04644},
  year={2024}
}
