algorithmic-efficiency
Workload Title: Decoder-only LM
Workload
Task
Text generation.
Dataset
TBD
Model
TBD. Possible candidates include:
- [preferred starting point] NanoDO
- NanoGPT
- Meta’s lingua
- Keller Jordan’s modded nanoGPT
Reference Implementation
TBD
ToDo
- [ ] Preliminary Benchmarking for model/dataset
- [ ] Choose Model
- [ ] Choose Dataset
- [ ] Implement data input pipeline
- [ ] Document specific dataset version in workload-specific README
- [ ] Add model in JAX
- [ ] Document model in workload-specific README
- [ ] Add model in PyTorch
- [ ] Target setting for new workload
- [ ] Provide sample submission (and sample tuning search space)
- [ ] Document results of sample submission in workload-specific README (how long it took to reach the target performance)