Zach Mueller

Results 368 comments of Zach Mueller

Oh boy, okay. Well, that was a thought 😢 CC @stas00 if I misunderstood anything? (See the nccl issue)

How are you launching the python script in your bash?

You're launching with `python`. You should use either `accelerate launch` or `torch.distributed.run`; otherwise you'll get model parallelism (which isn't what you're aiming for).
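One rough way to check from inside the script which way it was launched: distributed launchers such as `torch.distributed.run` export `WORLD_SIZE`, `RANK`, and `LOCAL_RANK` to each worker process, while a plain `python` run usually leaves them unset. The helper name `distributed_launch_info` below is hypothetical, and this is only a sketch of the idea, not something taken from Accelerate itself.

```python
import os


def distributed_launch_info(env=None):
    """Summarize the launcher-provided environment variables.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    # Assumption: missing variables mean a single, non-distributed process.
    world_size = int(env.get("WORLD_SIZE", "1"))
    rank = int(env.get("RANK", "0"))
    local_rank = int(env.get("LOCAL_RANK", "0"))
    return {
        "world_size": world_size,
        "rank": rank,
        "local_rank": local_rank,
        "distributed": world_size > 1,
    }


if __name__ == "__main__":
    info = distributed_launch_info()
    if info["distributed"]:
        print(f"Process {info['rank']} of {info['world_size']} (local rank {info['local_rank']})")
    else:
        print("Single process: no distributed launcher variables found")
```

Calling this at the top of a training script and seeing a world size of 1 on a multi-GPU machine is a hint that the script was started with plain `python`.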

Accelerate should handle most of this now, cc @SunMarc if you want to give this a try!

@janboeye yes, PyTorch does not have mixed precision support on MPS at this time.
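A minimal sketch of how a script might guard against this, assuming the statement above still holds for the installed PyTorch version. The helper name `pick_mixed_precision` is hypothetical, and enabling `"fp16"` only on CUDA is an illustrative choice; the returned strings are values that `Accelerator(mixed_precision=...)` accepts.

```python
def pick_mixed_precision(device_type: str) -> str:
    """Return a mixed-precision mode for a device type such as "cuda", "mps", or "cpu"."""
    # Assumption: only CUDA gets fp16 here; MPS and everything else
    # fall back to full precision ("no").
    if device_type == "cuda":
        return "fp16"
    return "no"


if __name__ == "__main__":
    for device_type in ("cuda", "mps", "cpu"):
        print(device_type, "->", pick_mixed_precision(device_type))
```

The result could then be passed along as `Accelerator(mixed_precision=pick_mixed_precision(device_type))`.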

Overall I can see this being a pretty nice idea. Made some nits to improve. cc @SunMarc for your thoughts as well :)

At this time we do not support multiple models with `deepspeed`. Please see: https://github.com/huggingface/accelerate/issues/2496