pytorch-deep-image-matting
Problem in modifying the code to multi GPU process
Hi, thanks for your awesome work. I want to modify the code to support multi-GPU training, so I changed your main code as below:

```python
if args.cuda:
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model = nn.DataParallel(model)
    model.to(device)
```
But I got this error:
```
Traceback (most recent call last):
  File "core/train.py", line 361, in <module>
    main()
  File "core/train.py", line 353, in main
    train(args, model, optimizer, train_loader, epoch)
  File "core/train.py", line 200, in train
    pred_mattes, pred_alpha = model(input_img)
  File "/home/chaofan/lib/anaconda2/envs/python36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chaofan/lib/anaconda2/envs/python36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 124, in forward
    return self.gather(outputs, self.output_device)
  File "/home/chaofan/lib/anaconda2/envs/python36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 136, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/chaofan/lib/anaconda2/envs/python36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 67, in gather
    return gather_map(outputs)
  File "/home/chaofan/lib/anaconda2/envs/python36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 62, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/chaofan/lib/anaconda2/envs/python36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 62, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
TypeError: zip argument #1 must support iteration
```
I have no idea what causes this problem. Do you have any suggestions? Thank you!
@hellozgm Same problem! Have you found a solution?
Multi-GPU training is not currently supported. If you want to accelerate training, increase the batchSize hyperparameter instead. However, our results show that batchSize=1 actually performs better.