EchoMimic
Quality drops with non-512x512 width and height
If I change -W and -H in infer_audio2vid_acc.py to anything other than 512x512, such as (256, 256), (384, 384), or (1024, 1024), the lip motion is damaged or disappears to varying degrees. The worst case is 1024x1024, where the whole face motion is destroyed and the face becomes unrecognizable.
-W 384 -H 384: https://github.com/user-attachments/assets/b4e85c38-760a-4f10-b87a-826dd4c774d8
-W 256 -H 256: https://github.com/user-attachments/assets/137c934b-0c62-4a89-81f1-70b15a7b54c3
-W 1024 -H 1024: https://github.com/user-attachments/assets/2d02e707-51c4-4eb5-8d4b-9a0575774021
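For anyone who wants to reproduce the runs above, here is a rough sketch that prints one command line per tested size. It only uses the script name and the -W / -H flags mentioned in this report. Running it with plain `python` from the repo root, and the 512x512 baseline entry for comparison, are my assumptions; any other arguments the script needs are not shown.

```python
import shlex

# Script name and the -W / -H flags come from this report.
SCRIPT = "infer_audio2vid_acc.py"

# Sizes tested in the report, plus 512x512 as an assumed baseline for comparison.
SIZES = [(256, 256), (384, 384), (512, 512), (1024, 1024)]


def build_command(width, height):
    """Return the argv list for one run at the given output size.

    Assumption: the script is launched as `python infer_audio2vid_acc.py`
    and every other option is left at its default.
    """
    return ["python", SCRIPT, "-W", str(width), "-H", str(height)]


def repro_commands(sizes=SIZES):
    """Return one shell-quoted command line per (width, height) pair."""
    return [shlex.join(build_command(w, h)) for w, h in sizes]


if __name__ == "__main__":
    for line in repro_commands():
        print(line)
```

Comparing each output against the 512x512 run should make it easy to see at which size the lip motion starts to break down.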