style-transfer-pytorch
Multi-GPU allocation
Hi,
I'm trying to understand device allocation here. I have GPUs with different capacities, and the program stops with an OOM during backward() even though free memory is still available.
In the code, I see two critical parts for GPU allocation:
- In class StyleTransfer, you create a device plan to spread the load of the ~27 VGG layers over the GPUs:
```python
if len(self.devices) == 1:
    device_plan = {0: self.devices[0]}
elif len(self.devices) == 2:
    device_plan = {0: self.devices[0], 5: self.devices[1]}
```
meaning you send the first 5 layers to GPU 0 and all the others to GPU 1.
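If I read the plan right, each key marks the layer index where a device's span begins. Something like this hypothetical helper is how I picture it being applied (the name and code are mine, not from the repo):

```python
def apply_device_plan(layers, device_plan):
    """Hypothetical helper, not the repo's actual code. Each key marks the
    layer index where a new device takes over: {0: 'cuda:0', 5: 'cuda:1'}
    puts layers 0-4 on cuda:0 and layers 5 onward on cuda:1."""
    device = None
    for i, layer in enumerate(layers):
        device = device_plan.get(i, device)
        layer.to(device)
    return layers

# usage sketch: apply_device_plan(list(vgg.features), {0: 'cuda:0', 5: 'cuda:1'})
```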
- In the main stylize loop, you send all the images and styles to GPU 0:
```python
self.image = self.image.to(self.devices[0])
content = content.to(self.devices[0])
```
so I'm not sure whether the load is spread at all during the backward pass.
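To check my understanding, I put together a minimal standalone sketch (made-up layer counts and channel sizes, not your VGG code) to see whether backward really runs on both devices when the forward is split:

```python
import torch
from torch import nn

# Two stages standing in for the VGG split; the 5-layer cut and the
# channel sizes are invented for illustration.
stage0 = nn.Sequential(*(nn.Conv2d(3 if i == 0 else 16, 16, 3, padding=1)
                         for i in range(5))).to('cuda:0')
stage1 = nn.Sequential(*(nn.Conv2d(16, 16, 3, padding=1)
                         for _ in range(10))).to('cuda:1')

x = torch.randn(1, 3, 256, 256, device='cuda:0', requires_grad=True)
out = stage1(stage0(x).to('cuda:1'))  # activation hops to GPU 1 here
out.mean().backward()                 # each op's backward runs on the device its forward ran on

for d in ('cuda:0', 'cuda:1'):
    print(d, torch.cuda.memory_allocated(d) // 2**20, 'MiB allocated')
```

If both devices report allocated memory after backward(), then the split does apply in the backward pass too, and the OOM would be about where the large intermediate activations end up rather than where the inputs start.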
How would you approach a version where the load is spread according to each GPU's capacity?
- Would you send everything to GPU 0 until it is almost full, and only then send the remaining data to GPU 1?
- Or would you keep the amount of data on each GPU balanced in proportion to its capacity during the whole process? (A rough sketch of what I mean follows below.)
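For the second option, I could imagine choosing the split index from each card's free memory, something like this (purely a sketch; torch.cuda.mem_get_info is the only real API I'm relying on, and the uniform-cost assumption is certainly wrong for VGG, whose early layers have the largest activations):

```python
import torch

def weighted_split_index(devices, num_layers=27):
    """Illustrative only: pick the layer index where the second GPU takes
    over, proportional to each card's free memory. Assumes two devices and
    a roughly uniform memory cost per layer."""
    free = [torch.cuda.mem_get_info(d)[0] for d in devices]
    split = round(num_layers * free[0] / sum(free))
    return max(1, min(num_layers - 1, split))

devices = ['cuda:0', 'cuda:1']
device_plan = {0: devices[0], weighted_split_index(devices): devices[1]}
```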
Regards, J