deep-blind-watermark-removal
Model unable to run on multiple GPUs
I have multiple GPUs and I would like to train the model on them for faster training. I see that you have already implemented multi-GPU training via nn.DataParallel. There were some bugs in VX.py, which were solved after I changed "self.model" to "self.model.module".
Yet even after setting "CUDA_VISIBLE_DEVICES=0,1", I still see only GPU 0's memory being filled, not GPU 1's.
The model gives a CUDA out-of-memory error if I try to use an input size >= 512 with a batch size of 12, or even 8.
Any idea why it is only using 1 of the 2 GPUs?
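A quick way to narrow this down is to check how many devices PyTorch actually sees at runtime. This is a minimal sketch (not from the repo); note that CUDA_VISIBLE_DEVICES has to be set in the shell before the script launches, or it has no effect:

```python
import os
import torch

# If CUDA_VISIBLE_DEVICES=0,1 took effect, device_count() should report 2.
# If it reports 1, the env var was not picked up and DataParallel will
# only ever see a single GPU.
print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
print("visible devices:", torch.cuda.device_count())
```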
Thanks
@vinthony Any updates on this?
Currently, our method should work on multiple GPUs. I am sorry I cannot test your issue myself, because I do not have a multi-GPU environment at the moment.
Basically, we have implemented some code here to support multiple GPUs: https://github.com/vinthony/deep-blind-watermark-removal/blob/d238edfd931abe2ddfbef5ca1fbef3c551969f47/scripts/machines/BasicMachine.py#L68
you may refer to some details here for debugging: https://stackoverflow.com/questions/54216920/how-to-use-multiple-gpus-in-pytorch
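For reference, the usual nn.DataParallel pattern looks like the sketch below (a toy model stands in for the repo's network; this is an illustration, not the repo's actual code). The wrapped network is reachable as model.module, which matches the "self.model.module" fix mentioned above:

```python
import torch
import torch.nn as nn

# Toy stand-in for the actual watermark-removal network.
model = nn.Linear(8, 4)

if torch.cuda.device_count() > 1:
    # DataParallel replicates the module on every visible GPU and
    # splits each batch along dim 0, so each GPU must get a non-empty
    # chunk; the original module lives at model.module.
    model = nn.DataParallel(model).cuda()
elif torch.cuda.is_available():
    model = model.cuda()

x = torch.randn(12, 8)
if torch.cuda.is_available():
    x = x.cuda()

out = model(x)
print(out.shape)  # torch.Size([12, 4])
```

One thing worth checking: DataParallel only parallelizes the forward/backward pass of the module it wraps. If parts of the pipeline run on tensors placed explicitly on `cuda:0`, or the model is wrapped after being moved to a single device, memory will still concentrate on GPU 0.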