Execution fails with: The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 4
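The image I uploaded is a PNG file, about 500 KB.
The audio is a FLAC file, about 1.8 MB.
Steps:
Start the service: python3 -u webgui.py --server_port=3000
Upload the image and the audio through the web page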
Log output:
To create a public link, set share=True in launch().
video in 24 FPS, audio idx in 50FPS
whisper_chunks: (266, 50, 384)
audio_fea_final: torch.Size([1, 266, 50, 384])
ref_image_latents shape: torch.Size([1, 4, 64, 64])
face_mask_tensor shape: torch.Size([1, 1, 1, 1024, 1024])
face_locator_tensor shape: torch.Size([2, 320, 1, 128, 128])
0%| | 0/30 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
response = await route_utils.call_process_api(
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/route_utils.py", line 288, in call_process_api
output = await app.get_blocks().process_api(
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/blocks.py", line 1931, in process_api
result = await self.call_function(
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/blocks.py", line 1516, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
return await future
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
result = context.run(func, *args)
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
response = f(*args, **kwargs)
File "/root/EchoMimic/webgui.py", line 233, in generate_video
final_output_path = process_video(
File "/root/EchoMimic/webgui.py", line 175, in process_video
video = pipe(
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/EchoMimic/src/pipelines/pipeline_echo_mimic.py", line 507, in __call__
pred = self.denoising_unet(
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/echomimic/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/root/EchoMimic/src/models/unet_3d_echo.py", line 494, in forward
sample = sample + face_musk_fea
RuntimeError: The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 4
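The shapes printed in the log hint at where the mismatch comes from: `ref_image_latents` is 64×64 in its last two dimensions, while `face_locator_tensor` is 128×128. Below is a minimal pure-Python sketch of the broadcasting rule that `sample + face_musk_fea` has to satisfy. The shape used for `sample` is an assumption (only its last dimension, 64, is confirmed by the error message), and `first_broadcast_conflict` is a hypothetical helper written for illustration, not part of EchoMimic or PyTorch.

```python
def first_broadcast_conflict(shape_a, shape_b):
    """Return (dim, size_a, size_b) for the first dimension that cannot be
    broadcast, scanning from the trailing dimension, or None if compatible."""
    n = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s, as broadcasting does.
    a = (1,) * (n - len(shape_a)) + tuple(shape_a)
    b = (1,) * (n - len(shape_b)) + tuple(shape_b)
    for dim in range(n - 1, -1, -1):
        # Sizes are compatible if they are equal or one of them is 1.
        if a[dim] != b[dim] and 1 not in (a[dim], b[dim]):
            return dim, a[dim], b[dim]
    return None


# Shape taken from the log line "face_locator_tensor shape".
face_locator_shape = (2, 320, 1, 128, 128)
# ASSUMPTION: batch, channel and frame sizes of `sample` are illustrative;
# only the trailing size of 64 is confirmed by the error message.
assumed_sample_shape = (2, 320, 12, 64, 64)

conflict = first_broadcast_conflict(assumed_sample_shape, face_locator_shape)
print(conflict)  # (4, 64, 128)
```

The face mask is logged as 1024×1024 and 1024 / 8 = 128, whereas a 64×64 latent corresponds to 512 / 8. One plausible reading is that the mask was built at the original image resolution while the reference image was encoded at 512×512, but the log alone does not confirm this.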
same problem +1
Try this fix: "A bug was found here, some jpeg format found no face here" #137
I have this problem too.
me too
det_bboxes, probs = face_detector.detect(cv2.cvtColor(face_img, cv2.COLOR_BGR2RGB))
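It may be because the image is too large. I just shrank the image, and once it was down to about 80 KB it ran. The original, at about 400 KB, failed with exactly the same error as yours.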
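If shrinking the image helps, it may be the pixel dimensions rather than the file size that matter. Here is a minimal sketch of preparing the image before upload, assuming the pipeline works at 512×512 (inferred from the 64×64 latents in the log and the 8× downsampling typical of Stable Diffusion VAEs, not confirmed from the EchoMimic code). `center_square_box` and `prepare_image` are hypothetical helper names, and Pillow is assumed to be installed.

```python
def center_square_box(width, height):
    """Return the (left, top, right, bottom) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def prepare_image(src_path, dst_path, target=512):
    """Center-crop the image to a square and resize it to target x target.

    ASSUMPTION: target=512 is a guess based on the logged latent size.
    """
    from PIL import Image  # imported here so the helper above has no dependency

    with Image.open(src_path) as img:
        img = img.convert("RGB")
        img = img.crop(center_square_box(*img.size))
        img = img.resize((target, target), Image.LANCZOS)
        img.save(dst_path)


# Example usage (paths are placeholders):
# prepare_image("input.png", "input_512.png")
```

A center crop can cut off a face that sits near the edge of the frame, so check the result before uploading it.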