Slurm support

Open eliahuhorwitz opened this issue 3 years ago • 6 comments

Hey, I am wondering whether accelerate supports SLURM and, if so, how one runs accelerate on SLURM in a multi-GPU setting. Thanks, Eliahu

eliahuhorwitz avatar Jul 13 '22 13:07 eliahuhorwitz

We don't have support for SLURM right now :-)

sgugger avatar Jul 13 '22 14:07 sgugger

Thanks for the quick answer! I am using a repo that was written with accelerate, but I am running it on a SLURM cluster. Assuming I manage to launch the code with torch.distributed.launch, will the existing code written with the accelerate API work as is, or will I need to refactor it to use PyTorch DDP directly?

eliahuhorwitz avatar Jul 13 '22 14:07 eliahuhorwitz

It should all work as long as you can launch it!

sgugger avatar Jul 13 '22 14:07 sgugger
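Note for readers trying this: one common way to get a script launched under SLURM is to start one task per GPU with srun and translate SLURM's per-task variables into the names that torch.distributed's env:// initialization reads. The sketch below is only an illustration under assumptions, not an accelerate feature: the helper name slurm_to_torch_env is hypothetical, the master address is assumed to be supplied by your batch script, and 29500 is just an example port. Whether accelerate picks these variables up in your version is something to verify.

```python
def slurm_to_torch_env(slurm_env, master_addr, master_port=29500):
    """Map SLURM per-task variables to the names torch.distributed expects.

    Hypothetical helper for illustration. `slurm_env` is any mapping that
    holds SLURM_PROCID, SLURM_NTASKS and SLURM_LOCALID (for example
    os.environ inside an srun task). `master_addr` must be the hostname of
    the node running rank 0; it is assumed to be provided by the caller.
    """
    return {
        "RANK": str(slurm_env["SLURM_PROCID"]),        # global rank of this task
        "WORLD_SIZE": str(slurm_env["SLURM_NTASKS"]),  # total number of tasks
        "LOCAL_RANK": str(slurm_env["SLURM_LOCALID"]), # rank within this node
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
    }


# Example with made-up values: task 3 of 8, second task on its node.
example = slurm_to_torch_env(
    {"SLURM_PROCID": "3", "SLURM_NTASKS": "8", "SLURM_LOCALID": "1"},
    master_addr="node001",
)
```

In a real job you would call os.environ.update(slurm_to_torch_env(os.environ, master_addr)) at the very top of the training script, before any distributed setup runs.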

Perfect, thanks!

eliahuhorwitz avatar Jul 13 '22 14:07 eliahuhorwitz

Could you please reopen this issue? I think I can work on this; maybe we can add some configuration with the help of submitit.

ZhiyuanChen avatar Aug 01 '22 15:08 ZhiyuanChen

Reopened, thanks!

eliahuhorwitz avatar Aug 01 '22 16:08 eliahuhorwitz

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Aug 26 '22 15:08 github-actions[bot]