Add Labrador Model (ICML 2024) – PyTorch Port With Tests
This PR contributes a new ML model to PyHealth: Labrador, a PyTorch reimplementation of the ICML 2024 model from:
Bellamy, Kumar, Wang, & Beam (2024). "Labrador: Exploring the Limits of Masked Language Modeling for Laboratory Data." ICML 2024.
Model Summary
- Transformer-based encoder for laboratory data
- Dual embedding pathways:
  - categorical lab codes
  - continuous lab values (with mask/null handling)
- Stacked Transformer encoder blocks
- Masked mean pooling over the sequence
- Optional static feature integration
- Downstream prediction head supporting binary, multiclass, multilabel, and regression tasks
- Fully compatible with PyHealth BaseModel and Trainer APIs
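
The encoder path summarized above (dual embeddings, Transformer blocks, masked mean pooling) can be sketched in a few lines of PyTorch. This is a minimal illustration under assumptions, not the code in pyhealth/models/labrador.py: the names TinyLabEncoder and masked_mean_pool, the use of code 0 as padding, and the single linear projection for lab values are all hypothetical, and the mask/null value handling, static features, and prediction head are left out.

```python
import torch
import torch.nn as nn


def masked_mean_pool(hidden: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Average hidden states over valid positions only.

    hidden: (batch, seq_len, dim)
    mask:   (batch, seq_len), True where a real lab entry is present.
    """
    weights = mask.unsqueeze(-1).to(hidden.dtype)   # (batch, seq_len, 1)
    summed = (hidden * weights).sum(dim=1)          # padded positions contribute 0
    counts = weights.sum(dim=1).clamp(min=1.0)      # guard against division by zero
    return summed / counts


class TinyLabEncoder(nn.Module):
    """Hypothetical stand-in: code embedding + value projection + Transformer."""

    def __init__(self, vocab_size: int = 32, dim: int = 16,
                 num_heads: int = 2, num_layers: int = 2):
        super().__init__()
        # Assumption: lab code 0 is reserved for padding.
        self.code_embedding = nn.Embedding(vocab_size, dim, padding_idx=0)
        # Assumption: each scalar lab value is projected to the model dimension.
        self.value_projection = nn.Linear(1, dim)
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=num_heads, dim_feedforward=4 * dim, batch_first=True
        )
        # Nested tensors are disabled so the output keeps the padded sequence length.
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=num_layers, enable_nested_tensor=False
        )

    def forward(self, codes: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
        mask = codes != 0                                    # True for real entries
        x = self.code_embedding(codes) + self.value_projection(values.unsqueeze(-1))
        x = self.encoder(x, src_key_padding_mask=~mask)      # True = ignore position
        return masked_mean_pool(x, mask)                     # (batch, dim)
```

Pooling with the mask, rather than a plain mean over the sequence axis, keeps padded positions from diluting the patient-level representation when sequences in a batch have different lengths.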
Included in this PR
- pyhealth/models/labrador.py — full PyTorch implementation of the model
- Integration in pyhealth/models/__init__.py
- Unit tests (tests/core/test_labrador.py) verifying:
  - initialization
  - forward pass
  - backward pass
  - custom hyperparameters
- Complete docstring with citation and architectural explanation
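
The forward and backward checks listed for the unit tests can be illustrated roughly as follows. This is a sketch of the general pattern only, not the contents of tests/core/test_labrador.py: StandInModel and check_forward_and_backward are hypothetical names, the stand-in is a plain MLP rather than Labrador, and the output dictionary keys (loss, logit, y_prob, y_true) are assumed here for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StandInModel(nn.Module):
    """Hypothetical placeholder model; the real tests target the Labrador class."""

    def __init__(self, input_dim: int = 8, hidden_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> dict:
        logit = self.net(x)
        loss = F.binary_cross_entropy_with_logits(logit, y)
        # Assumed output layout: a dict carrying the loss and predictions.
        return {"loss": loss, "logit": logit, "y_prob": torch.sigmoid(logit), "y_true": y}


def check_forward_and_backward(model: nn.Module, x: torch.Tensor, y: torch.Tensor) -> list:
    """Run one forward and backward pass; return names of parameters without gradients."""
    out = model(x, y)
    assert out["loss"].dim() == 0            # forward pass: loss is a scalar
    assert out["logit"].shape == y.shape     # forward pass: one logit per label
    out["loss"].backward()                   # backward pass
    return [
        name for name, p in model.named_parameters()
        if p.requires_grad and p.grad is None
    ]
```

An empty list from check_forward_and_backward means every trainable parameter received a gradient. A custom-hyperparameter check would follow the same pattern with non-default constructor arguments, for example StandInModel(input_dim=4, hidden_dim=8).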
Notes
- This PR focuses on the supervised downstream variant (classification/regression).
- MLM pretraining heads from the original TensorFlow implementation are not included.