algorithmic-efficiency
LM1B: JAX
Workload
LM1B
Task
Transformer language model training, modeled on the Flax example at https://github.com/google/flax/tree/main/examples/lm1b.
Dataset
LM1B (One Billion Word Benchmark) dataset: https://www.tensorflow.org/datasets/catalog/lm1b
Model
Decoder-only Transformer or encoder-decoder Transformer.
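The decoder-only variant can be sketched in plain JAX as follows. This is an illustrative single-head block with toy dimensions, not the reference implementation (which follows the Flax lm1b example); the names `causal_self_attention` and `decoder_block` and all sizes here are hypothetical:

```python
import jax
import jax.numpy as jnp

def causal_self_attention(x, w_q, w_k, w_v, w_o):
    """Single-head causal self-attention over a [seq_len, d_model] input."""
    seq_len, d_model = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / jnp.sqrt(d_model)
    # Causal mask: position i may only attend to positions <= i.
    mask = jnp.tril(jnp.ones((seq_len, seq_len), dtype=bool))
    scores = jnp.where(mask, scores, -1e9)
    return jax.nn.softmax(scores, axis=-1) @ v @ w_o

def decoder_block(x, params):
    """Pre-LayerNorm decoder block: attention + MLP, each with a residual."""
    def layer_norm(h):
        mu = h.mean(axis=-1, keepdims=True)
        var = h.var(axis=-1, keepdims=True)
        return (h - mu) / jnp.sqrt(var + 1e-6)

    x = x + causal_self_attention(layer_norm(x), *params["attn"])
    x = x + jax.nn.relu(layer_norm(x) @ params["mlp_in"]) @ params["mlp_out"]
    return x

# Toy dimensions; the real workload would use the sizes from the Flax
# lm1b example, stacked into multiple layers with embeddings on top.
key = jax.random.PRNGKey(0)
d_model, d_ff, seq_len = 16, 64, 8
keys = jax.random.split(key, 6)
params = {
    "attn": [jax.random.normal(k, (d_model, d_model)) * 0.1 for k in keys[:4]],
    "mlp_in": jax.random.normal(keys[4], (d_model, d_ff)) * 0.1,
    "mlp_out": jax.random.normal(keys[5], (d_ff, d_model)) * 0.1,
}
x = jax.random.normal(key, (seq_len, d_model))
y = decoder_block(x, params)
```

Because every per-position operation (LayerNorm, MLP) acts independently and attention is causally masked, perturbing a later token leaves the outputs at earlier positions unchanged.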
Reference Implementation
https://github.com/google/flax/tree/main/examples/lm1b
ToDo
- [ ] Implement data input pipeline
- [ ] Document specific dataset version in workload-specific README
- [ ] Add model
- [ ] Document model in workload-specific README
- [ ] Provide sample submission (and sample tuning search space)
- [ ] Document results of the sample submission in the workload-specific README (how long it took to reach the target performance)