stylegan2-pytorch
train with augment
Do I train the model with python -m torch.distributed.launch --nproc_per_node=N_GPU --master_port=PORT train.py --batch BATCH_SIZE LMDB_PATH --augment?
Yes. python -m torch.distributed.launch --nproc_per_node=N_GPU --master_port=PORT train.py --batch BATCH_SIZE --augment LMDB_PATH
Hi @rosinality, after turning on --augment, I got the following error:
File "/home/zhule.zhl/miniconda3/envs/py37_torch160/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward input = module(input) File "/home/zhule.zhl/miniconda3/envs/py37_torch160/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl result = self.forward(*input, **kwargs) File "/mnt3/zhule.zhl/gitWorks/stylegan2-pytorch/op/fused_act.py", line 101, in forward return fused_leaky_relu(input, self.bias, self.negative_slope, self.scale) File "/mnt3/zhule.zhl/gitWorks/stylegan2-pytorch/op/fused_act.py", line 119, in fused_leaky_relu return FusedLeakyReLUFunction.apply(input, bias, negative_slope, scale) File "/mnt3/zhule.zhl/gitWorks/stylegan2-pytorch/op/fused_act.py", line 66, in forward out = fused.fused_bias_act(input, bias, empty, 3, 0, negative_slope, scale) RuntimeError: input must be contiguous
Do you know why this happens? I couldn't fix it.
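As the last traceback frame says, the fused_bias_act CUDA kernel requires its input tensor to be contiguous in memory, so a non-contiguous view (the result of a transpose, for example) triggers exactly this error. A minimal sketch of the usual remedy, with illustrative tensors rather than the actual activations from the model:

```python
import torch

x = torch.randn(4, 8).t()   # a transposed view is not contiguous
print(x.is_contiguous())    # False

# Copying into contiguous memory is the standard remedy before
# calling a kernel that requires a contiguous layout:
x = x.contiguous()
print(x.is_contiguous())    # True
```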
@tearscoco I got the same error when training with augmentation
@tearscoco @ElektrischesSchaf It will be fixed with bb459e0.
@rosinality thank you, now it works perfectly
I also found that when training with --augment, this error appears:
File "/workspace/gan-pytorch/stylegan2-pytorch-rosinality/non_leaking.py", line 316, in get_padding
pad = pad.max(torch.tensor([0, 0] * 2, device=device))
RuntimeError: Expected object of scalar type float but got scalar type long int for argument 'other'
whereas training without --augment does not produce this error. I'm not sure whether this is because I'm feeding a custom dataset into the model.
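The mismatch is between the float pad tensor computed upstream and the comparison tensor built from integer literals: torch.tensor([0, 0] * 2) infers int64 ("long"), and elementwise max on the PyTorch 1.6 builds seen in this thread does not promote dtypes. A minimal reproduction with illustrative values:

```python
import torch

pad = torch.tensor([1.0, 2.0, 1.0, 2.0])  # float32, like the pad computed in get_padding()
bound = torch.tensor([0, 0] * 2)          # integer literals -> int64 ("long")

# On PyTorch 1.6 this raises:
# RuntimeError: Expected object of scalar type float but got scalar type long int for argument 'other'
pad = pad.max(bound)
```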
My solution is to add dtype=torch.float32 in the get_padding() function of non_leaking.py:
pad = pad.max(torch.tensor([0, 0] * 2, device=device, dtype=torch.float32))
pad = pad.min(torch.tensor([width - 1, height - 1] * 2, device=device, dtype=torch.float32))
@ElektrischesSchaf You can fix this by changing it to pad = pad.max(torch.tensor([0.0, 0.0] * 2, device=device)).
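Both fixes work for the same reason: float literals make torch.tensor infer float32, so the comparison tensor's dtype matches pad without an explicit dtype argument. The min against [width - 1, height - 1] * 2 needs the same treatment, as in the dtype=torch.float32 version above.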