RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 1024, 32, 32]
Running the default inference parameters on the v1
branch appears to lead to a dimension error in the UnetUp
class here: https://github.com/YuvalNirkin/fsgan/blob/v1/models/simple_unet.py#L135
python face_swap_video2video.py ../docs/examples/shinzo_abe.mp4 -t ../docs/examples/conan_obrien.mp4 -o output
results in:
RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 1024, 32, 32]
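For context, the failure reproduces outside fsgan in a few lines, at least on recent PyTorch builds (the stricter conv1d shape check with this exact wording): nn.Conv1d only accepts 3D [N, C, L] (or unbatched 2D) input, but here it is being handed the 4D [N, C, H, W] feature map from the message above. A minimal standalone sketch, not fsgan code:

import torch
import torch.nn as nn

x = torch.randn(1, 1024, 32, 32)            # 4D feature map [N, C, H, W], as in the error
conv = nn.Conv1d(1024, 512, kernel_size=1)  # Conv1d expects 3D [N, C, L] input
conv(x)  # raises RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, ...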
I had made some minor changes to use CPU only while getting this to work on my M1 before moving to a host with an Nvidia GPU, but nothing that would change dimensions here. Still, since I don't see this issue posted already, I assume it's something local to my machine or arguments; I'd appreciate any pointers.
Same issue... I don't know why...
What is your PyTorch version?
@YuvalNirkin my version is "1.11.0+cu113". I'm running the code in Colab.
With the sample source and target, I'm getting an error as follows:
/content/projects/fsgan/models/simple_unet_02.py in forward(self, inputs1, inputs2)
    133     def forward(self, inputs1, inputs2):
    134         outputs2 = self.up(inputs2)
--> 135         outputs2 = self.conv1d(outputs2,)
    136         offset = outputs2.size()[2] - inputs1.size()[2]
    137         padding = 2 * [offset // 2, offset // 2]
...
RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [24, 1024, 32, 32]
My PyTorch version is 1.11.0
@prodigy-sub -- you're getting that dimension error (different first dimension) with the same input/target videos I posted originally?
Yes, correct. I'm using the Conan and Abe videos. It fails at the segmentation step.
@YuvalNirkin
The following is the full error message. I think there is some problem with the segmentation model, but since I'm using the pre-trained weights you offered, I think changing the model's layers is not an option... do you have any solutions?
100%|██████████| 600/600 [03:35<00:00, 2.78frames/s]
=> Extracting sequences from detections in video: "source.mp4"...
100%|██████████| 601/601 [00:00<00:00, 11066.43it/s]
=> Cropping video sequences from video: "source.mp4"...
100%|██████████| 600/600 [00:04<00:00, 148.46it/s]
=> Computing face poses for video: "source_seq00.mp4"...
100%|██████████| 5/5 [00:03<00:00, 1.53batches/s]
=> Computing face landmarks for video: "source_seq00.mp4"...
100%|██████████| 10/10 [00:03<00:00, 2.72batches/s]
=> Computing face segmentation for video: "source_seq00.mp4"...
  0%|          | 0/25 [00:00<?, ?batches/s]
RuntimeError Traceback (most recent call last)
9 frames
/content/projects/fsgan/inference/swap.py in __call__(self, source_path, target_path, output_path, select_source, select_target, finetune)
    237
    238         # Cache input
--> 239         source_cache_dir, source_seq_file_path, _ = self.cache(source_path)
    240         target_cache_dir, target_seq_file_path, _ = self.cache(target_path)
    241

/content/projects/fsgan/preprocess/preprocess_video.py in cache(self, input_path, output_dir)
    478
    479         # Cache segmentation
--> 480         self.process_segmentation(input_path, output_dir, seq_file_path)
    481
    482         return output_dir, seq_file_path, pose_file_path if self.cache_pose and is_vid else None

/content/projects/fsgan/preprocess/preprocess_video.py in process_segmentation(self, input_path, output_dir, seq_file_path)
    382
    383             # Compute segmentation
--> 384             raw_segmentation = self.S(frame)
    385             segmentation = torch.cat((prev_segmentation, raw_segmentation), dim=0)
    386                 if prev_segmentation is not None else raw_segmentation

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/content/projects/fsgan/models/simple_unet_02.py in forward(self, inputs)
     69
     70         center = self.center(maxpool4)
---> 71         up4 = self.up_concat4(conv4, center)
     72         up3 = self.up_concat3(conv3, up4)
     73         up2 = self.up_concat2(conv2, up3)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/content/projects/fsgan/models/simple_unet_02.py in forward(self, inputs1, inputs2)
    133     def forward(self, inputs1, inputs2):
    134         outputs2 = self.up(inputs2)
--> 135         outputs2 = self.conv1d(outputs2,)
    136         offset = outputs2.size()[2] - inputs1.size()[2]
    137         padding = 2 * [offset // 2, offset // 2]

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in forward(self, input)
    300
    301     def forward(self, input: Tensor) -> Tensor:
--> 302         return self._conv_forward(input, self.weight, self.bias)
    303
    304

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
    297                             _single(0), self.dilation, self.groups)
    298         return F.conv1d(input, weight, bias, self.stride,
--> 299                         self.padding, self.dilation, self.groups)
    300
    301     def forward(self, input: Tensor) -> Tensor:
RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [24, 1024, 32, 32]
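Not a fix, but a note on what the traceback shows: line 135 feeds the upsampled 4D [N, C, H, W] feature map into a module named conv1d, and recent PyTorch's stricter shape checks restrict nn.Conv1d to 3D input. A kernel-size-1 Conv1d weight ([out, in, 1]) holds exactly the same values as a 1x1 Conv2d weight ([out, in, 1, 1]); presumably older PyTorch builds routed this call through the 2D convolution path and so tolerated it, which would explain why downgrading (see below) avoids the error. It also means the pre-trained weights are not inherently tied to the failing layer type. A hypothetical sketch of that equivalence, an illustration only, not the actual fsgan module or an official fix:

import torch
import torch.nn as nn

# Hypothetical illustration (not fsgan code): weights trained for a kernel-1
# Conv1d can be viewed as a 1x1 Conv2d without retraining, since the parameter
# tensors differ only by a trailing singleton dimension.
conv1d = nn.Conv1d(1024, 512, kernel_size=1)  # weight shape: [512, 1024, 1]
conv2d = nn.Conv2d(1024, 512, kernel_size=1)  # weight shape: [512, 1024, 1, 1]
with torch.no_grad():
    conv2d.weight.copy_(conv1d.weight.unsqueeze(-1))  # [512,1024,1] -> [512,1024,1,1]
    conv2d.bias.copy_(conv1d.bias)

x = torch.randn(1, 1024, 32, 32)  # the 4D map from the error (batch of 1 for brevity)
print(conv2d(x).shape)            # torch.Size([1, 512, 32, 32]) -- no RuntimeError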
SOLVED partially: I had the same issue ([55, 1024, 32, 32])... for me at least it was a PyTorch incompatibility; it doesn't like PyTorch 1.11.0 nor 1.0.1...
I commented out the five install-dependency lines (anaconda/conda, pip3, etc.) and replaced them with what's below.
Install PyTorch (we assume 1.5.1, but VISSL works with all PyTorch versions >= 1.4):
!pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
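Optionally, a quick sanity check that the pinned build is the one actually loaded (generic PyTorch introspection, not part of the original instructions):

import torch
print(torch.__version__, torch.version.cuda)  # expect: 1.5.1+cu101 10.1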
Install OpenCV:
!pip install opencv-python
Install Apex by checking system settings (CUDA version, PyTorch version, Python version):
import sys
import torch
version_str = "".join([
    f"py3{sys.version_info.minor}_cu",
    torch.version.cuda.replace(".", ""),
    f"_pyt{torch.__version__[0:5:2]}"
])
print(version_str)
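On the setup above (Python 3.7, CUDA 10.1, PyTorch 1.5.1) this should print py37_cu101_pyt151: the [0:5:2] slice picks the digits out of the "1.5.1" version string, and the resulting tag selects the matching pre-built Apex wheel in the next step.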
Install Apex (pre-compiled with optimizer C++ extensions and CUDA kernels):
!pip install apex -f https://dl.fbaipublicfiles.com/vissl/packaging/apexwheels/{version_str}/download.html
Install VISSL:
!pip install vissl
NEW ISSUE: insufficient GPU RAM during segmentation of target.mp4!
You can try this issue's solution: #161
Thank you. This issue should be fixed now. Follow the new installation instructions.