Zach Mueller
Yes, DDP does. We have already documented this, and a fix is being put in (I also have an article talking about this more; tl;dr you can choose a slower...
cc @pacman100
Hello hello! I think I know enough to answer yes/no with mild confidence now 🫡 Ty for your patience 🙇 IIUC, the key areas of the diff are: 1. During...
@microsoft-github-policy-service agree company="Hugging Face"
Sorry for the extraneous pushes while I was figuring something out. Good to go now :)
You can see our new accelerate benchmarking scripts here: https://github.com/huggingface/accelerate/tree/muellerzr-msamp-ds-fsdp/benchmarks/fp8/ms_amp
@tocean @wkcn any particular issues with this? :) (Ideally it'd be great to include this in the next accelerate release on the 1st :) )
Ack okay, I suppose we'll have to wait for @tocean / @abuccts / @guoshzhao to take a look. Thanks for the flag 🤗
Yep, exactly. We can add a deprecation warning here, but we've never advertised it in that way at that import level, so I think we should be fine.
Sure, we absolutely can. If you'd like to expand our checkpointing example here in accelerate to implement that, we can look at upstreaming it further 🤗