Add Labrador Model (ICML 2024) – PyTorch Port With Tests

Open Rad099 opened this issue 1 month ago • 0 comments

This PR contributes a new ML model to PyHealth: Labrador, a PyTorch reimplementation of the ICML 2024 model from:

Bellamy, Kumar, Wang, & Beam (2024). "Labrador: Exploring the Limits of Masked Language Modeling for Laboratory Data." ICML 2024.

Model Summary

  • Transformer-based encoder for laboratory data
  • Dual embedding pathways:
      • categorical lab codes
      • continuous lab values (with mask/null handling)
  • Stacked Transformer encoder blocks
  • Masked mean pooling over the sequence
  • Optional static feature integration
  • Downstream prediction head supporting binary, multiclass, multilabel, and regression tasks
  • Fully compatible with PyHealth BaseModel and Trainer APIs
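
As a rough illustration of how the pieces listed above fit together, here is a minimal PyTorch sketch. It is not the code in pyhealth/models/labrador.py: the class name `TinyLabEncoder`, its arguments, the use of index 0 as padding, and the learned null vector are all assumptions made for this example, and static features and PyHealth's BaseModel integration are left out.

```python
import torch
import torch.nn as nn


class TinyLabEncoder(nn.Module):
    """Hypothetical, simplified sketch; NOT the Labrador class from this PR."""

    def __init__(self, vocab_size=32, d_model=16, n_heads=2, n_layers=2, n_out=1):
        super().__init__()
        # Pathway 1: categorical lab codes (assumption: index 0 is padding).
        self.code_emb = nn.Embedding(vocab_size, d_model, padding_idx=0)
        # Pathway 2: continuous lab values projected to the same width.
        self.value_proj = nn.Linear(1, d_model)
        # Learned vector used wherever a lab value is missing (null).
        self.null_value = nn.Parameter(torch.zeros(d_model))
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        # Nested-tensor fast path is disabled so output keeps shape (B, T, D).
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=n_layers, enable_nested_tensor=False
        )
        self.head = nn.Linear(d_model, n_out)

    def forward(self, codes, values, value_is_null):
        pad = codes.eq(0)  # (B, T), True at padded positions
        # Zero out null slots first so NaN placeholders never reach the linear layer.
        values = values.masked_fill(value_is_null, 0.0)
        v = self.value_proj(values.unsqueeze(-1))  # (B, T, D)
        v = torch.where(value_is_null.unsqueeze(-1), self.null_value, v)
        x = self.code_emb(codes) + v  # combine the two pathways
        h = self.encoder(x, src_key_padding_mask=pad)
        # Masked mean pooling: average only over non-padded positions.
        keep = (~pad).unsqueeze(-1).to(h.dtype)
        pooled = (h * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
        return self.head(pooled)  # raw logits / regression outputs
```

Calling `TinyLabEncoder()(codes, values, value_is_null)` with a `(B, T)` integer tensor of codes, a `(B, T)` float tensor of values, and a `(B, T)` boolean null mask returns a `(B, n_out)` tensor. The sketch assumes every sequence has at least one non-padded position.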

Included in this PR

  • pyhealth/models/labrador.py — full PyTorch implementation of the model
  • Integration in pyhealth/models/__init__.py
  • Unit tests (tests/core/test_labrador.py) verifying:
      • initialization
      • forward pass
      • backward pass
      • custom hyperparameters
  • Complete docstring with citation and architectural explanation
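
The four test categories listed above could look roughly like the following. This is only a sketch of the pattern, not the contents of tests/core/test_labrador.py: `StandInModel` is a hypothetical placeholder with made-up constructor arguments, used so the example runs without assuming the real Labrador signature or a PyHealth dataset.

```python
import unittest

import torch
import torch.nn as nn


class StandInModel(nn.Module):
    """Hypothetical placeholder for the model under test (not Labrador itself)."""

    def __init__(self, vocab_size=16, hidden_dim=8, num_classes=1):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.emb = nn.Embedding(vocab_size, hidden_dim)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, codes):
        # Plain mean over the sequence, then a linear head.
        return self.head(self.emb(codes).mean(dim=1))


class TestStandInModel(unittest.TestCase):
    def setUp(self):
        torch.manual_seed(0)
        self.codes = torch.randint(0, 16, (4, 5))  # batch of 4, length 5

    def test_initialization(self):
        model = StandInModel()
        self.assertEqual(model.hidden_dim, 8)

    def test_forward(self):
        logits = StandInModel()(self.codes)
        self.assertEqual(tuple(logits.shape), (4, 1))

    def test_backward(self):
        model = StandInModel()
        model(self.codes).sum().backward()
        # Every parameter should have received a gradient.
        self.assertTrue(all(p.grad is not None for p in model.parameters()))

    def test_custom_hyperparameters(self):
        model = StandInModel(hidden_dim=12, num_classes=3)
        self.assertEqual(tuple(model(self.codes).shape), (4, 3))
```

Saved to a file, tests in this style can be run with `python -m unittest`.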

Notes

  • This PR focuses on the supervised downstream variant (classification/regression).
  • MLM pretraining heads from the original TensorFlow implementation are not included.

Rad099 · Dec 07 '25 22:12