
Multi GPU problem

Open zsgj-Xxx opened this issue 3 years ago • 4 comments

Hi, I think I've run into a new problem. I compared Ranger with Ranger21 on a fine-grained dataset, and Ranger21's results are much worse than Ranger's. I do get exciting results on my own computer, but the results on a multi-card server are poor. Do you know why?

Ranger net_top11

Ranger21 net_top1

zsgj-Xxx, Jul 06 '21 13:07

All the settings are the same except for the optimizer.

zsgj-Xxx, Jul 06 '21 13:07

Hi @zsgj-Xxx, thanks for opening the issue. We have not had a chance to test Ranger21 on multi-gpu yet, but for sure some aspects of it need to be adjusted in order to run properly (primarily b/c Ranger21 includes an lr scheduler internally), so performance will not be as good on multi-gpu as on single gpu. Ranger (original Ranger) should operate w/o issue on multi-gpu, and that's reflected in part by its better perf on multi-gpu here. I'm finishing up some testing on a new feature for Ranger21 today and will then try to set up a multi-gpu scenario to get it optimized for handling this. I'll leave this issue open for now to track it. Thanks!

lessw2020, Jul 06 '21 15:07
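
One way an optimizer with a built-in lr schedule could behave differently on multi-gpu is sketched below. This is an illustration only, not Ranger21's actual code, and the thread does not confirm it as the cause. The assumptions are that the data is sharded evenly across processes (as a distributed sampler typically does) and that the schedule is driven by a total step count supplied up front. `steps_per_epoch` and `toy_schedule` are hypothetical helpers written for this example.

```python
import math


def steps_per_epoch(num_samples: int, per_gpu_batch: int, world_size: int = 1) -> int:
    """Optimizer steps each process takes per epoch when the dataset is
    split evenly across `world_size` processes (assumed sharding)."""
    samples_per_rank = math.ceil(num_samples / world_size)
    return math.ceil(samples_per_rank / per_gpu_batch)


def toy_schedule(step: int, total_steps: int, base_lr: float,
                 warmup_frac: float = 0.1, warmdown_frac: float = 0.3) -> float:
    """Hypothetical schedule: linear warmup, flat phase, linear warmdown.
    It stands in for any schedule that depends on `total_steps`."""
    warmup_steps = max(1, round(total_steps * warmup_frac))
    warmdown_steps = max(1, round(total_steps * warmdown_frac))
    warmdown_start = total_steps - warmdown_steps
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    if step < warmdown_start:
        return base_lr
    return base_lr * max(0, total_steps - step) / warmdown_steps


epochs = 10
single_gpu_total = epochs * steps_per_epoch(10_000, 50, world_size=1)  # 2000 steps
four_gpu_total = epochs * steps_per_epoch(10_000, 50, world_size=4)    # 500 steps

# Each process only runs `four_gpu_total` steps. If the schedule was still
# configured with the single-GPU count, training ends in the flat phase
# and the warmdown is never reached.
last_step = four_gpu_total - 1
lr_with_stale_total = toy_schedule(last_step, single_gpu_total, base_lr=1e-3)
lr_with_correct_total = toy_schedule(last_step, four_gpu_total, base_lr=1e-3)
```

Under these assumptions, the step count passed to such an optimizer would need to be the per-process count, not the single-GPU one.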

@lessw2020 Great project, thanks for all of the hard work on it! I'm seeing similar issues. Interestingly, when switching to multiple GPUs (even when multiplying the LR by the number of GPUs), the loss doesn't drop any faster, whether measured in steps or wall time. Any ideas on why that would be? Thanks!

ryanstout, Jul 14 '21 21:07
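
For reference, the arithmetic behind "multiplying the LR by the number of GPUs" can be written out as below. This is the common linear scaling heuristic for data-parallel training, where the effective batch size grows with the number of processes. Whether it suits Ranger21 is not established in this thread, and the function names are made up for the example.

```python
def effective_batch_size(per_gpu_batch: int, world_size: int) -> int:
    """Samples contributing to one optimizer step across all processes."""
    return per_gpu_batch * world_size


def linearly_scaled_lr(base_lr: float, world_size: int) -> float:
    """Linear scaling heuristic: scale the lr by the same factor as the batch."""
    return base_lr * world_size


# Example: 4 GPUs, 32 samples per GPU, single-GPU lr of 1e-3.
batch = effective_batch_size(32, 4)    # 128 samples per step
lr = linearly_scaled_lr(1e-3, 4)       # approximately 4e-3
```

The heuristic only changes the step size. It does not change how many steps the schedule inside the optimizer expects to run.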

Hey @lessw2020, curious whether you've had a chance to work on this further? :)

@ryanstout This discussion is probably of interest to you: https://github.com/lessw2020/Ranger21/discussions/4#discussioncomment-826453

rsomani95, Aug 11 '21 06:08