RAD-NeRF
Given input size: (192x2x2). Calculated output size: (192x0x0). Output size is too small
Hello, thanks for your nice work. When I run the code on my own video, the following problem appears.
~/MyCode/RAD_NeRF$ python main.py data/person_video1_25fps_512x512/ --workspace person_video1/ -O --iters 250000 --finetune_lips
Namespace(H=450, O=True, W=450, amb_dim=2, asr=False, asr_model='cpierse/wav2vec2-large-xlsr-53-esperanto', asr_play=False, asr_save_feats=False, asr_wav='', att=2, aud='', bg_img='', bound=1, ckpt='latest', color_space='srgb', cuda_ray=True, data_range=[0, -1], density_thresh=10, density_thresh_torso=0.01, dt_gamma=0.00390625, emb=False, exp_eye=True, fbg=False, finetune_lips=True, fix_eye=-1, fovy=21.24, fp16=True, fps=50, gui=False, head_ckpt='', ind_dim=4, ind_dim_torso=8, ind_num=10000, iters=250000, l=10, lambda_amb=0.1, lr=0.005, lr_net=0.0005, m=50, max_ray_batch=4096, max_spp=1, max_steps=16, min_near=0.05, num_rays=65536, num_steps=16, offset=[0, 0, 0], part=False, part2=False, patch_size=1, path='data/person_video1_25fps_512x512/', preload=0, r=10, radius=3.35, scale=4, seed=0, smooth_eye=False, smooth_lips=False, smooth_path=False, smooth_path_window=7, test=False, test_train=False, torso=False, torso_shrink=0.8, train_camera=False, update_extra_interval=1000000000.0, upsample_steps=0, workspace='person_video1/')
[INFO] load 783 train frames.
[INFO] load aud_features: torch.Size([861, 44, 16])
Loading train data: 100%|███████████████████████████████████████████████| 783/783 [00:00<00:00, 2328.66it/s]
[INFO] eye_area: 0.02593994140625 - 0.06561279296875
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
/home/anaconda3/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/home/anaconda3/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=AlexNet_Weights.IMAGENET1K_V1`. You can also use `weights=AlexNet_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
Loading model from: /home/anaconda3/lib/python3.8/site-packages/lpips/weights/v0.1/alex.pth
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /home/anaconda3/lib/python3.8/site-packages/lpips/weights/v0.1/alex.pth
[INFO] Trainer: ngp | 2023-02-23_12-56-18 | cuda | fp16 | person_video1/
[INFO] #parameters: 3024277
[INFO] Loading latest checkpoint ...
[INFO] Latest checkpoint is person_video1/checkpoints/ngp_ep0256.pth
[INFO] loaded model.
[INFO] load at epoch 256, global step 200448
[INFO] loaded optimizer.
[INFO] loaded scheduler.
[INFO] loaded scaler.
[INFO] load 79 val frames.
[INFO] load aud_features: torch.Size([861, 44, 16])
Loading val data: 100%|███████████████████████████████████████████████████| 79/79 [00:00<00:00, 2280.58it/s]
[INFO] eye_area: 0.0255584716796875 - 0.0614166259765625
[INFO] max_epoch = 320
==> Start Training Epoch 257, lr=0.000050 ...
loss=0.0001 (0.0004), lr=0.000045: : 0% 1/783 [00:01<13:32, 1.04s/it]Traceback (most recent call last):
File "main.py", line 253, in <module>
trainer.train(train_loader, valid_loader, max_epoch)
File "/home/MyCode/ashawkeyRAD_NeRF/nerf/utils.py", line 906, in train
self.train_one_epoch(train_loader)
File "/home/MyCode/ashawkeyRAD_NeRF/nerf/utils.py", line 1169, in train_one_epoch
preds, truths, loss = self.train_step(data)
File "/home/MyCode/ashawkeyRAD_NeRF/nerf/utils.py", line 766, in train_step
loss = loss + 0.01 * self.criterion_lpips(pred_rgb, rgb)
File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/anaconda3/lib/python3.8/site-packages/lpips/lpips.py", line 119, in forward
outs0, outs1 = self.net.forward(in0_input), self.net.forward(in1_input)
File "/home/anaconda3/lib/python3.8/site-packages/lpips/pretrained_networks.py", line 85, in forward
h = self.slice3(h)
File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 204, in forward
input = module(input)
File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/modules/pooling.py", line 166, in forward
return F.max_pool2d(input, self.kernel_size, self.stride,
File "/home/anaconda3/lib/python3.8/site-packages/torch/_jit_internal.py", line 485, in fn
return if_false(*args, **kwargs)
File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 782, in _max_pool2d
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: Given input size: (192x2x2). Calculated output size: (192x0x0). Output size is too small
loss=0.0001 (0.0004), lr=0.000045: : 0% 2/783 [00:01<07:52, 1.65it/s]
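From the trace, the LPIPS (AlexNet) term used by --finetune_lips is getting a lip crop that is only a few pixels across, so the pooling layers inside AlexNet cannot produce any output. Below is a minimal standalone sketch of the same failure and a workaround, assuming only that torch and lpips are installed (the 32x32 size is my own guess at a safe minimum, not a value from the repo):

```python
# Minimal reproduction sketch: AlexNet inside LPIPS collapses tiny inputs,
# giving exactly the "Output size is too small" RuntimeError shown above.
import torch
import torch.nn.functional as F
import lpips

criterion = lpips.LPIPS(net='alex')

tiny = torch.rand(1, 3, 8, 8) * 2 - 1   # a lip crop only a few pixels wide, in [-1, 1]
try:
    criterion(tiny, tiny)
except RuntimeError as e:
    print('LPIPS fails on tiny patches:', e)

# Upsampling the crop (or fixing the preprocessing so the crop has a sane size)
# lets the loss run; 32x32 is an assumed minimum, not an official value.
ok = F.interpolate(tiny, size=(32, 32), mode='bilinear', align_corners=False)
criterion(ok, ok)
print('LPIPS runs after upsampling to 32x32')
```

So the real question seems to be why the lip region extracted from my video is that small.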
Excuse me, has this problem been solved?
Excuse me, has this problem been solved?
Solved it
How did you solve it? I'm running into the same problem here.
Excuse me, has this problem been solved?
Solved.
Hello, could you share your solution? I'm getting this error too. My video is 512*512 pixels.
Hello, can you share your solution? Thank you very much.
Hello, can you share your solution? Thank you very much.
The problem is with your video's face parsing, so take a look at the file face_parsing/test.py.
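If it helps, one quick sanity check is to look at how large the detected mouth region is in each frame, since --finetune_lips crops that region before feeding it to LPIPS. A rough sketch, assuming the preprocessing wrote 68-point landmark files as ori_imgs/*.lms (the path and the 20-pixel threshold below are assumptions, adjust them to your data):

```python
# Hypothetical check: flag frames whose detected mouth box is only a few pixels
# across; such frames would explain the "Output size is too small" LPIPS error.
import glob
import numpy as np

for path in sorted(glob.glob('data/person_video1_25fps_512x512/ori_imgs/*.lms')):
    lms = np.loadtxt(path)       # assumed 68x2 landmark array from preprocessing
    mouth = lms[48:68]           # mouth landmarks in the 68-point convention
    w = mouth[:, 0].max() - mouth[:, 0].min()
    h = mouth[:, 1].max() - mouth[:, 1].min()
    if min(w, h) < 20:           # assumed threshold; tune for your resolution
        print(f'{path}: suspiciously small mouth box ({w:.0f} x {h:.0f})')
```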
Hello, can you share your solution? Thank you very much.
https://github.com/ashawkey/RAD-NeRF/issues/46
Hello, can you share your solution? Thank you very much.
The problem is with your video's face parsing, so take a look at the file face_parsing/test.py.
Sorry, what is the purpose of looking at this source code? Do you know how to solve this problem?
I'm facing the same issue. Can you please help me, @aishoot? I would really appreciate it, thank you.