algorithmic-efficiency
LM1B: JAX
Workload
LM1B
Task
Transformer language model training, modeled on the Flax example at https://github.com/google/flax/tree/main/examples/lm1b.
Dataset
LM1B (One Billion Word Benchmark) dataset: https://www.tensorflow.org/datasets/catalog/lm1b
Model
Decoder-only Transformer or encoder-decoder Transformer.
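The decoder-only variant can be sketched in plain JAX as follows. This is an illustrative single-head block with toy dimensions, not the reference implementation (which follows the Flax lm1b example); the names `causal_self_attention` and `decoder_block` and all sizes here are hypothetical:

```python
import jax
import jax.numpy as jnp

def causal_self_attention(x, w_q, w_k, w_v, w_o):
    """Single-head causal self-attention over a [seq_len, d_model] input."""
    seq_len, d_model = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / jnp.sqrt(d_model)
    # Causal mask: position i may only attend to positions <= i.
    mask = jnp.tril(jnp.ones((seq_len, seq_len), dtype=bool))
    scores = jnp.where(mask, scores, -1e9)
    return jax.nn.softmax(scores, axis=-1) @ v @ w_o

def decoder_block(x, params):
    """Pre-LayerNorm decoder block: attention + MLP, each with a residual."""
    def layer_norm(h):
        mu = h.mean(axis=-1, keepdims=True)
        var = h.var(axis=-1, keepdims=True)
        return (h - mu) / jnp.sqrt(var + 1e-6)

    x = x + causal_self_attention(layer_norm(x), *params["attn"])
    x = x + jax.nn.relu(layer_norm(x) @ params["mlp_in"]) @ params["mlp_out"]
    return x

# Toy dimensions; the real workload would use the sizes from the Flax
# lm1b example, stacked into multiple layers with embeddings on top.
key = jax.random.PRNGKey(0)
d_model, d_ff, seq_len = 16, 64, 8
keys = jax.random.split(key, 6)
params = {
    "attn": [jax.random.normal(k, (d_model, d_model)) * 0.1 for k in keys[:4]],
    "mlp_in": jax.random.normal(keys[4], (d_model, d_ff)) * 0.1,
    "mlp_out": jax.random.normal(keys[5], (d_ff, d_model)) * 0.1,
}
x = jax.random.normal(key, (seq_len, d_model))
y = decoder_block(x, params)
```

Because every per-position operation (LayerNorm, MLP) acts independently and attention is causally masked, perturbing a later token leaves the outputs at earlier positions unchanged.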
Reference Implementation
https://github.com/google/flax/tree/main/examples/lm1b
ToDo
- [ ] Implement data input pipeline
- [ ] Document specific dataset version in workload-specific README
- [ ] Add model
- [ ] Document model in workload-specific README
- [ ] Provide sample submission (and sample tuning search space)
- [ ] Document results of the sample submission in the workload-specific README (how long it took to reach the target performance)