wav2lip-hq
The result looks like this:
Why does it look so different from yours?
me too.. my result is horrible..
raw:
after:
I am using your Google Colab demo.
Why does it look so different from yours?
The quality can decrease if the speech you are using for inference differs considerably from the training data, which consisted of calm speech in Russian. Using another model can also help. For instance, the ESRGAN available via this link was finetuned on a video of the particular person the model is applied to. Using it instead of the default model provided in the Google Colab notebook may improve the quality.
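For reference, swapping in the finetuned ESRGAN usually comes down to pointing the inference command at a different checkpoint file. This is only a sketch: the flag names below (`--sr_path` in particular) and the file paths are assumptions, so check the notebook's inference cell for the exact arguments used by this repo.

```shell
# Sketch only -- flag names and paths are assumptions; verify them against
# the actual inference cell in the Colab notebook before running.
python inference.py \
    --checkpoint_path checkpoints/wav2lip_gan.pth \
    --face input_video.mp4 \
    --audio input_audio.wav \
    --sr_path checkpoints/esrgan_finetuned.pth  # finetuned ESRGAN instead of the default
```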
me too.. my result is horrible..
Unfortunately, as stated in the readme, the training set didn't contain enough data, so the model is not able to generalize well. The videos in the training set looked different from the screenshot you shared: for instance, all of them had a white background, while the background of your photo is a different color. To obtain good results, please finetune the model.
Thank you for the reply, I got it.
I used a video with a white background, but the quality of the lip-sync clip is no better. I used English audio.