darknet
Training with multiple GPUs is not faster than 1 GPU???
I followed the guide to train on my dataset with multiple GPUs, but the training speed is the same in both cases. I use the same config:
batch=64
subdivisions=32 # 16 OOM
width=512
height=512
...
max_batches=10000
I checked GPU usage and almost all GPUs are being used.
@AlexeyAB
Could you help me?
I use the same batch, max_batches, and subdivisions for 1 GPU and multiple GPUs, but the training time is the same.
I read this issue https://github.com/AlexeyAB/darknet/issues/1165 and @AlexeyAB you also commented on it.
As I understand it, if we use 4 GPUs, we need to reduce max_batches by 4 times (compared to the 1-GPU case) to get better speed (because with more GPUs, more images are processed in each iteration), and change the learning_rate and burn_in if needed, as described here: https://github.com/AlexeyAB/darknet/tree/64efa721ede91cd8ccc18257f98eeba43b73a6af#how-to-train-with-multi-gpu. Is that right?
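If my understanding is right, the [net] section of the cfg for 4 GPUs would look roughly like this (a sketch only; the scaled learning_rate and burn_in values are assumptions taken from the linked guide, and the base values are from my config, not verified):

```
[net]
batch=64
subdivisions=32
width=512
height=512
# with 4 GPUs each iteration processes 4x more images,
# so (if my understanding is right) reduce max_batches 4x:
max_batches=2500        # was 10000 with 1 GPU
# per the linked multi-GPU guide, optionally rescale these too:
learning_rate=0.00065   # assumed base 0.00261 divided by 4 GPUs
burn_in=4000            # assumed base 1000 multiplied by 4 GPUs
```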
AlexeyAB is no longer working on Darknet/YOLO.
You should see the FAQ. It has some information on speeding up training: https://www.ccoderun.ca/programming/darknet_faq/#time_to_train
Are you sure you are using the correct command? Post the command you are using to train on multiple GPUs.
@stephanecharette Thanks for the response.
Here is the command I used for multi-GPU training (after training with 1 GPU for some iterations to get the weights):
./darknet detector train data/obj.data yolov4-custom.cfg backup/yolov4-custom_last.weights -gpus 0,1 -dont_show