video-retalking
Result has bad pixels on the mouth
/data/workspace/tts/video-retalking/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py:284: UserWarning:
!! WARNING !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (c++) is not compatible with the compiler Pytorch was
built with for this platform, which is g++ on linux. Please
use g++ to to compile your extension. Alternatively, you may
compile PyTorch from source using c++, and then you can also use
c++ to compile your extension.
See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
with compiling PyTorch from source.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! WARNING !!
warnings.warn(WRONG_COMPILER_WARNING.format(
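This warning usually just means that the `c++` on `PATH` is not `g++` (on some distros it is clang or an old symlink). Since `torch.utils.cpp_extension` honors the `CXX` environment variable when choosing a compiler, pointing it at `g++` should silence the warning. It may well be unrelated to the mouth artifacts, but it removes noise from the log:

```shell
# torch.utils.cpp_extension reads the CXX environment variable when set,
# so pointing it at g++ avoids the compiler-mismatch warning.
export CXX=g++
# then re-run inference as usual, e.g.:
# python3 inference.py --face input.mp4 --audio input.wav --outfile output.mp4
```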
[Info] Using cuda for inference.
[Step 0] Number of frames available for inference: 520
[Step 1] Using saved landmarks.
[Step 2] 3DMM Extraction In Video:: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 520/520 [00:14<00:00, 36.65it/s]
using expression center
Load checkpoint from: checkpoints/DNet.pt
Load checkpoint from: checkpoints/LNet.pth
Load checkpoint from: checkpoints/ENet.pth
[Step 3] Using saved stabilized video.
[Step 4] Load audio; Length of mel chunks: 446
[Step 5] Reference Enhancement: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 446/446 [01:22<00:00, 5.43it/s]
[Step 6] Lip Synthesis:: 0%|
This is the main log; there is no specific error message.
No face detected in this image███████▉ | 73/520 [00:17<00:13, 32.01it/s]
No face detected in this image
No face detected in this image
No face detected in this image
No face detected in this image
No face detected in this image
The log keeps saying "No face detected in this image", but my input video always contains a face.
Similar issue:
I tried the same video on Google Colab; the result is OK and there is no "No face detected in this image" log. I think detection failures on some frames on my own machine are causing this issue, but I have no idea how to fix it.
https://github.com/OpenTalker/video-retalking/assets/11890900/15556d9b-3f1d-4afe-be9b-32bc4b9c1855
Hi, I have a similar issue. I am using Google Colab and the result is not satisfying. Is it because of footage quality, or did I do something wrong?
https://github.com/OpenTalker/video-retalking/assets/11890900/ad7a8473-400d-4894-87bd-9821b5a3a163
For example here, what's happening to the face? Is there any way to avoid the dissolve on the face?
download.17.mp4 Hi, I have a similar issue. I am using Google Colab and the result is not satisfying. Is it because of footage quality, or did I do something wrong?
In this video, I think it's just that video-retalking, which inpaints the mouth area, needs the mouth pixels not to be occluded by any other object, like the mic. You should use a video with an unobstructed mouth.
download.21.mp4 For example here, what's happening to the face? Is there any way to avoid the dissolve on the face?
This video's mouth is unobstructed, but when the mouth moves so fast, the inpainted area flickers. I think that's a known limitation of all video-inpainting techniques. You'd better try a video where the head position doesn't change so fast.
Got some insight on this issue. I found that the FaceEnhancement process (in gpen_face_enhancer.py) first fails with some error (without a log). Before the FaceEnhancement.process method is called, the input frame looks OK, like this:
But after the enhancer, the output image is collapsed:
I tried 3 different input videos; same issue on my CentOS machine, still no clue.
Could you get a better result by skipping the face enhancement process?
I dropped the result of the enhancer in the preprocessing before datagen in inference.py, and the final synthesized video looks OK to my bare eyes. But I still want to know why....
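For anyone wanting the same workaround, a minimal sketch of the bypass. The helper names here are hypothetical, not the actual inference.py edit; the real change is simply not using the enhancer's output before datagen:

```python
class IdentityEnhancer:
    """Drop-in stand-in that returns the frame unchanged, so the rest of
    the pipeline runs without the GPEN enhancement step."""
    def process(self, frame):
        return frame

def maybe_enhance(frame, enhancer=None):
    """Run the enhancer if one is given; otherwise pass the frame through."""
    if enhancer is None:
        return frame
    result = enhancer.process(frame)
    # Some GPEN versions return (image, orig_faces, enhanced_faces).
    return result[0] if isinstance(result, tuple) else result
```

Skipping enhancement trades some sharpness in the synthesized mouth region for stability, which matches the observation that the un-enhanced result looks fine to the naked eye.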