Bug fixing: GPU runtime

Open · J-BING opened this issue 3 years ago · 1 comment

Hi, thanks for the good work and for open-sourcing EasyFL. I am interested in your recent work on SSL + FL.

When I ran FedSSL, I found a bug in the GPU runtime. Specifically, when it is launched with "--gpu 1", the GPU is never used. The cause is the condition if args.gpu > 1 in main.py: it is false for "--gpu 1", which leaves self.conf.gpu == 0 all the time.

What I changed to make it work:

  1. Add an else branch for if args.gpu > 1: config["gpu"] = args.gpu
  2. Set self.conf.device = "cuda" when self.conf.gpu == 1 in coordinator.py
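
The two changes above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual EasyFL code: `parse_args`, `build_config`, and `select_device` are hypothetical stand-ins for the relevant parts of main.py and coordinator.py, and the default of `config["gpu"] = 0` as well as the handling of values above 1 are assumptions.

```python
import argparse


def parse_args(argv):
    """Parse only the --gpu flag discussed in this issue."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu", type=int, default=0)
    return parser.parse_args(argv)


def build_config(args):
    """Hypothetical stand-in for the config assembly in main.py."""
    config = {"gpu": 0}  # assumed default: run on CPU
    if args.gpu > 1:
        # Existing multi-GPU path (slurm setup is not shown in the issue).
        # Setting the GPU count here is an assumption of this sketch.
        config["gpu"] = args.gpu
    else:
        # Added branch: without it, "--gpu 1" leaves config["gpu"] at 0.
        config["gpu"] = args.gpu
    return config


def select_device(gpu):
    """Hypothetical stand-in for the device choice in coordinator.py.

    The issue only covers gpu == 1; treating larger counts the same way
    is an assumption of this sketch.
    """
    return "cuda" if gpu >= 1 else "cpu"


if __name__ == "__main__":
    config = build_config(parse_args(["--gpu", "1"]))
    print(config["gpu"], select_device(config["gpu"]))
```

With the else branch in place, a single-GPU run carries the requested GPU count through to the device selection instead of silently falling back to the CPU.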

BTW, slurm is used for the multi-GPU runtime. For environments that do not use slurm, I recommend adding warnings in the code or the tutorial to make the multi-GPU configuration clear.
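
One possible shape for such a warning is sketched below. It is only an illustration: `warn_if_no_slurm` is a hypothetical helper, not part of EasyFL, and checking the `SLURM_JOB_ID` environment variable (which slurm sets inside a job) is an assumed way to detect a slurm environment.

```python
import os
import warnings


def warn_if_no_slurm(gpu, environ=None):
    """Warn when more than one GPU is requested outside a slurm job.

    Returns True if a warning was issued, False otherwise.
    """
    environ = os.environ if environ is None else environ
    if gpu > 1 and "SLURM_JOB_ID" not in environ:
        warnings.warn(
            "Multi-GPU runs are set up through slurm, but no slurm job "
            "was detected; check the multi-GPU configuration.",
            RuntimeWarning,
            stacklevel=2,
        )
        return True
    return False
```

Calling this helper once, right after argument parsing, would surface the slurm dependency before any training starts.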

Hope it will help. Thanks for your work.

J-BING · Jul 12 '22 20:07

Hi @J-BING, this feedback is really useful. We will incorporate it and fix the related issues. It would be even better if you have time to create a pull request for the changes in coordinator.py; I'll merge it into the repo. Thank you.

weimingwill · Jul 13 '22 05:07