Wav2Lip
audio time != mel time
I use a 16 kHz mono WAV that is 6.7025 s long. The mel gives 167 frames, and 167 * 40 ms = 6.68 s, which matches. With a 6 s WAV, the mel gives 147 frames, i.e. 5.58 s. Why? Issue: the audio still plays, but the mel have no
# inference.py:328
mel = audio.melspectrogram(wav)
print(mel.shape)

if np.isnan(mel.reshape(-1)).sum() > 0:
    raise ValueError(
        "Mel contains nan! Using a TTS voice? Add a small epsilon noise to the wav file and try again"
    )

mel_chunks = []
mel_idx_multiplier = 80.0 / fps
i = 0
while 1:
    start_idx = int(i * mel_idx_multiplier)
    if start_idx + mel_step_size > len(mel[0]):
        mel_chunks.append(mel[:, len(mel[0]) - mel_step_size :])
        break
    mel_chunks.append(mel[:, start_idx : start_idx + mel_step_size])
    i += 1
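melspectrogram pads the signal, and the logic that builds mel_chunks drops some mel windows at the end. Together these cause the time mismatch.
By my estimate the difference should be within 15/80 - 1/25 = 0.1475 s.
I don't understand why your difference is so large.
You can debug the code above to analyze it.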
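
To make that debugging easier, here is a minimal sketch that replays the chunking loop above on a frame count alone, so you can see how many chunks (and therefore video frames) a given audio length should produce. It is not code from the repo. The helper names `count_mel_chunks` and `expected_mel_frames` are made up for illustration, and the defaults are assumptions you should check against your own hparams: 16 kHz sample rate, hop size 200 (so 80 mel frames per second), `mel_step_size = 16`, 25 fps, and a centered STFT that returns `1 + n_samples // hop_size` frames.

```python
def expected_mel_frames(n_samples, hop_size=200):
    """Assumed frame count of a centered (padded) STFT: 1 + n_samples // hop_size."""
    return 1 + n_samples // hop_size


def count_mel_chunks(n_mel_frames, fps=25.0, mel_step_size=16, mel_fps=80.0):
    """Replay the mel_chunks loop from inference.py, counting chunks only.

    Assumes n_mel_frames >= mel_step_size, as the original loop does.
    """
    mel_idx_multiplier = mel_fps / fps
    n_chunks = 0
    i = 0
    while True:
        start_idx = int(i * mel_idx_multiplier)
        if start_idx + mel_step_size > n_mel_frames:
            # The last chunk is taken from the tail of the mel, then the loop stops.
            n_chunks += 1
            break
        n_chunks += 1
        i += 1
    return n_chunks


if __name__ == "__main__":
    sample_rate = 16000  # assumed, matches the 16 kHz audio in the question
    fps = 25.0           # assumed, matches the 40 ms per frame in the question
    for seconds in (6.0, 6.7025):
        n_samples = int(round(seconds * sample_rate))
        n_frames = expected_mel_frames(n_samples)
        n_chunks = count_mel_chunks(n_frames, fps=fps)
        print(f"{seconds} s audio: {n_frames} mel frames, "
              f"{n_chunks} chunks, {n_chunks / fps:.2f} s of video")
```

Under these assumptions a 6 s clip gives 481 mel frames and 147 chunks, which is 147 / 25 = 5.88 s of video, about 0.12 s shorter than the audio and inside the 0.1475 s bound. If the numbers you print from `mel.shape` and `len(mel_chunks)` differ from this, the hop size, sample rate, or fps in your run are probably not the ones assumed here.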