yolo2-pytorch

Multi-GPU support?

Open psu1 opened this issue 7 years ago • 2 comments

When I run with torch.nn.DataParallel(net).cuda(), I get "AttributeError: 'DataParallel' object has no attribute 'loss'".

After I change loss = net.loss to loss = net.module.loss, I get the error "TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'" at return self.bbox_loss + self.iou_loss + self.cls_loss

Do I need to rewrite the loss function outside "class Darknet19(nn.Module)"?

Any better idea?

psu1 avatar May 02 '18 04:05 psu1

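Yes, you need to rewrite the loss function outside the model. DataParallel replicates your model onto each GPU and runs forward() on those replicas, so member variables assigned inside forward() (such as bbox_loss, iou_loss and cls_loss) are set on the replicas, not on net.module. That is why they are still None when you read them from net.module.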
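A minimal sketch of that pattern, not the repository's actual code: TinyDetector, detection_loss, the tensor shapes and the MSE terms are all hypothetical stand-ins for Darknet19 and its real loss. The point it illustrates is that forward() only returns predictions, and the loss is computed from the gathered outputs outside the wrapped model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyDetector(nn.Module):
    """Hypothetical stand-in for Darknet19: forward() returns predictions only."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 5, kernel_size=3, padding=1)

    def forward(self, images):
        out = self.conv(images)
        bbox_pred = out[:, :4]   # 4 box channels (assumed layout)
        iou_pred = out[:, 4:5]   # 1 IoU/objectness channel (assumed layout)
        # Return tensors instead of storing them on self: DataParallel
        # gathers returned tensors, but not attributes set on replicas.
        return bbox_pred, iou_pred


def detection_loss(bbox_pred, iou_pred, bbox_target, iou_target):
    """Placeholder loss living outside the model (MSE is an assumption)."""
    bbox_loss = F.mse_loss(bbox_pred, bbox_target)
    iou_loss = F.mse_loss(iou_pred, iou_target)
    return bbox_loss + iou_loss


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = nn.DataParallel(TinyDetector()).to(device)

# Dummy batch and targets, only to make the sketch runnable.
images = torch.randn(4, 3, 8, 8, device=device)
bbox_target = torch.zeros(4, 4, 8, 8, device=device)
iou_target = torch.zeros(4, 1, 8, 8, device=device)

# Outputs from all replicas are gathered on the default device.
bbox_pred, iou_pred = net(images)
loss = detection_loss(bbox_pred, iou_pred, bbox_target, iou_target)
loss.backward()
```

With this layout the training loop never reads net.loss or net.module.loss, so it behaves the same on one GPU, several GPUs, or CPU.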

longcw avatar May 06 '18 12:05 longcw

Hi, I am trying to run the code on multiple GPUs and I have rewritten the loss function outside the model. Training looks normal, but when I test the model it gives a lot of negative APs. Do you have any idea what the reason might be? Thanks!

feiyuelankuang avatar Jul 30 '18 21:07 feiyuelankuang