[Roadmap] DeepSpeed Roadmap Q1 2026
This is a living document! For each item here, we intend to link the corresponding RFC as well as the discussion channel in the DeepSpeed Slack.
New Accelerator Support
- [ ] DeepSpeed support on TPU
Emergent Model Architectures
- [ ] SuperOffloading for Mixture-of-Experts (MoE) Training
Reinforcement Learning
- [ ] DeepSpeed backend integration as the training engine for verl
New Optimizer Support
- [ ] Muon Optimizer Support for ZeRO3
Does "DeepSpeed backend integration as the training engine for verl" means to be the default training engine for verl?
Does "DeepSpeed backend integration as the training engine for verl" means to be the default training engine for verl?
No, I believe this means integrating DeepSpeed as one of the backend options alongside FSDP and Megatron.
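For context, here is a minimal sketch of what such a backend might do internally: verl workers would wrap the policy model in a DeepSpeed engine instead of an FSDP or Megatron one. The `build_actor_engine` helper and the specific config values are hypothetical illustrations; only the `deepspeed.initialize` call and the ZeRO/bf16 config keys are the actual DeepSpeed API.

```python
# Sketch of a hypothetical DeepSpeed training backend for a verl worker.
# `build_actor_engine` is an illustrative name, not a verl or DeepSpeed API.
import deepspeed
import torch


def build_actor_engine(model: torch.nn.Module, train_batch_size: int):
    # Real DeepSpeed config keys: ZeRO stage 3 shards parameters,
    # gradients, and optimizer states across data-parallel ranks.
    ds_config = {
        "train_batch_size": train_batch_size,
        "bf16": {"enabled": True},
        "zero_optimization": {"stage": 3},
    }
    # deepspeed.initialize wraps the model in a DeepSpeedEngine that owns
    # the optimizer, gradient accumulation, and ZeRO sharding; the backend
    # would then drive engine.backward()/engine.step() in the RL training loop.
    engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )
    return engine, optimizer
```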