CrossStagePartialNetworks

Need help in setting hyper-parameters

Open Rajasekhar06 opened this issue 4 years ago • 6 comments

@WongKinYiu I have been trying to set the right hyper-parameters for yolov3-spp on the complete Open Images dataset, but after 300-400 iterations the server restarts. Previously I trained on only 3 of the 601 classes with a small dataset of around 1100 images, and there I used single-GPU parameters even though I was training on multiple GPUs. Now, training on the whole dataset with multi-GPU parameters causes a system reboot. By the way, how do you calculate the parameters for multi-GPU training? You already replied to me in previous issues on @AlexeyAB's repo about how to set burn_in, learning_rate, and decay in the cfg file, but the values as described by Alexey seem to be causing the issue, so I changed almost all the parameters back to the single-GPU config except burn_in, and the problem still persists.

[Screenshot from 2020-02-07 15-49-29] [Screenshot from 2020-02-07 15-49-07]

For the above hardware, here is the link to the config I'm using. Please help me out. Thanks
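For reference, these are the [net] fields I am asking about. The values below are only an illustrative sketch, assuming the usual convention of dividing learning_rate by the number of GPUs and scaling burn_in up by the same factor; they are not the values from my linked config.

```
[net]
batch=64
subdivisions=16
width=608
height=608
momentum=0.9
# decay is usually left unchanged for multi-GPU
decay=0.0005
# single-GPU default is learning_rate=0.001; divided by 4 GPUs:
learning_rate=0.00025
# single-GPU default is burn_in=1000; multiplied by 4 GPUs:
burn_in=4000
max_batches=500200
policy=steps
steps=400000,450000
scales=.1,.1
```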

Rajasekhar06 avatar Feb 07 '20 10:02 Rajasekhar06

Now, training on the whole dataset with multi-GPU parameters causes a system reboot.

This is a hardware issue: insufficient power, or a hardware bug in the GPU.

AlexeyAB avatar Feb 07 '20 12:02 AlexeyAB

Should I upgrade my PSU to meet the power requirements? Or would reducing the image resolution to a smaller size also work? That would decrease accuracy though, right?
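To be clear, by reducing the resolution I just mean lowering width and height in the [net] section (they must stay multiples of 32), something like:

```
# baseline
# width=608
# height=608
# reduced input resolution (less GPU load, usually some mAP loss)
width=416
height=416
```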

RajashekarY avatar Feb 07 '20 13:02 RajashekarY

try to train by using 2-3 GPUs instead of 4.
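Something like this, using darknet's -gpus flag to select three of the four cards (the data/cfg/weights paths here are just placeholders, not your linked config):

```
./darknet detector train data/openimages.data cfg/yolov3-spp.cfg darknet53.conv.74 -gpus 0,1,2
```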

AlexeyAB avatar Feb 07 '20 14:02 AlexeyAB

@RajashekarY what is your PSU?

LukeAI avatar Feb 07 '20 14:02 LukeAI

Actually I don't know, @LukeAI. I use this system remotely, so I need to ask the owner 😛 But

try to train by using 2-3 GPUs instead of 4.

Might get the job done

RajashekarY avatar Feb 07 '20 15:02 RajashekarY

The peak power draw of a Titan RTX is about 390 W, so with four of them you need at least a 1500 W, and preferably a 2000 W, power supply. However, in my experience the reboot is usually caused by the mainboard's protection circuitry, because a single PSU is not stable enough for multiple GPUs. In our case, we use dual PSUs for a single mainboard.

WongKinYiu avatar Feb 08 '20 00:02 WongKinYiu