[REQUEST] torch.compile + DeepSpeed
I am looking into running DeepSpeed with torch.compile and am facing several issues related to tracing the hooks.
DeepSpeed Stage 2 backward hook tracing with Compiled Autograd
- Accessing `param.grad` directly fails while tracing the model with AOTAutograd, because `param.grad` is not populated during tracing. Reading the `.grad` field this way is not recommended with compiled autograd.
- Several parts of the implementation itself cause graph breaks, e.g. calling `id(param)` (see the sketch after this list).
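To make the pattern concrete, here is a rough, hypothetical hook (not DeepSpeed's actual implementation) that mirrors both points: a per-parameter gradient hook that reads `param.grad` directly and uses `id(param)` as a bucket key.

```python
# Hypothetical sketch, not DeepSpeed's code: a per-parameter gradient hook that
# reads param.grad and keys buckets by id(param).
import torch
import torch.nn as nn

model = nn.Linear(16, 16)

def make_hook(param):
    def hook(*_):
        grad = param.grad        # under compiled autograd / AOTAutograd tracing,
                                 # .grad is not yet populated at this point
        bucket_key = id(param)   # id() on a tensor is untraceable -> graph break
        return grad, bucket_key
    return hook

for p in model.parameters():
    # fires after the gradient for p has been accumulated
    p.register_post_accumulate_grad_hook(make_hook(p))

model(torch.randn(4, 16)).sum().backward()  # fine in eager; tracing the backward is not
```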
With DeepSpeed Stage 3, torch.compile itself fails while tracing the forward hooks. A similar issue shows up when tracing model parallelism.
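A rough reproduction sketch of the Stage 3 case (toy model and placeholder config; run under the `deepspeed` launcher). The failure shows up when Dynamo has to trace the parameter gather/partition forward hooks that ZeRO-3 installs on the module.

```python
# Hypothetical minimal repro; model and config values are placeholders.
import torch
import torch.nn as nn
import deepspeed

model = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 2))
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "zero_optimization": {"stage": 3},   # ZeRO Stage 3 partitions parameters
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
}
engine, _, _, _ = deepspeed.initialize(model=model, config=ds_config,
                                       model_parameters=model.parameters())

compiled = torch.compile(engine.module)   # Dynamo must trace ZeRO-3's parameter
x = torch.randn(4, 32).to(engine.device)  # gather/partition forward hooks, which
loss = compiled(x).sum()                  # is where tracing fails
engine.backward(loss)
```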
There is an effort from PyTorch to make FSDP traceable. Could you share whether there is a similar effort to enable DeepSpeed with torch.compile, or a list of features that currently work with torch.compile?
Hoping this gets supported as soon as possible. It would be very useful for LLMs.
Hi @sssiva81, @BobLiu20,
I submitted a draft PR #4878 to enable torch.compile. Please feel free to try it. You can also check an example on Megatron-DeepSpeed.
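A minimal usage sketch, assuming an engine-level `compile()` entry point; the exact API the PR exposes may differ, and applying `torch.compile` to `engine.module` is the generic fallback.

```python
# Hedged sketch only; engine.compile() below is an assumption, and the model and
# config values are placeholders. Run under the deepspeed launcher.
import torch
import torch.nn as nn
import deepspeed

model = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 8))
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
}
engine, _, _, _ = deepspeed.initialize(model=model, config=ds_config,
                                       model_parameters=model.parameters())

engine.compile()                          # assumed entry point; otherwise use
                                          # torch.compile(engine.module)
x = torch.randn(8, 64).to(engine.device)
loss = engine(x).sum()
engine.backward(loss)
engine.step()
```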
Thanks @tohtana. Is there a similar effort to enable pipeline parallelism with torch.compile? Currently it fails if we torch.compile the pipeline module, because of this assertion: `assert isinstance(model, deepspeed.PipelineEngine)`.
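For reference, a rough sketch of how the assertion trips (placeholder layers and config; run under the `deepspeed` launcher): torch.compile wraps the `PipelineModule` in an `OptimizedModule`, so `deepspeed.initialize` no longer recognizes it as a pipeline model.

```python
# Hypothetical illustration with placeholder layers/config.
import torch
import torch.nn as nn
import deepspeed
from deepspeed.pipe import PipelineModule

deepspeed.init_distributed()  # PipelineModule needs torch.distributed initialized

pipe = PipelineModule(layers=[nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 2)],
                      num_stages=1)
compiled = torch.compile(pipe)  # wraps pipe in an OptimizedModule, which is no
                                # longer an instance of PipelineModule

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
}
engine, _, _, _ = deepspeed.initialize(model=compiled, config=ds_config,
                                       model_parameters=compiled.parameters())

# deepspeed.initialize keys off isinstance(model, PipelineModule), so it builds a
# plain DeepSpeedEngine here, and training code that expects pipeline semantics
# fails at exactly this check:
assert isinstance(engine, deepspeed.PipelineEngine)
```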
Hi @sssiva81, the assertion was fixed by #5197. Sorry for the delay.