UEAzSpeech
Mismatch between viseme and audio data
Sometimes a TTS request takes about twice as long as usual to complete. The generated sound wave has the correct duration and plays normally, but the entire viseme timeline comes out roughly twice as long, so the viseme data no longer matches the audio. What could cause this?
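To put a number on the mismatch, one option is to compare the offset of the last viseme with the duration of the generated audio. The snippet below is only a minimal sketch of that check, not code from UEAzSpeech: `VisemeSample`, `PcmDurationMs`, `LastVisemeOffsetMs` and `TimelineRatio` are hypothetical names, and it assumes that viseme offsets are reported in ticks of 100 ns and that the audio is uncompressed PCM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container for one viseme event (not a UEAzSpeech type).
struct VisemeSample {
    int VisemeId;
    std::int64_t AudioOffsetTicks; // assumed unit: 1 tick = 100 ns
};

// Duration of raw PCM audio in milliseconds, derived from its size and format.
double PcmDurationMs(std::int64_t DataBytes, int SampleRate, int Channels, int BitsPerSample) {
    assert(SampleRate > 0 && Channels > 0 && BitsPerSample >= 8);
    const double BytesPerSecond =
        static_cast<double>(SampleRate) * Channels * (BitsPerSample / 8);
    return 1000.0 * static_cast<double>(DataBytes) / BytesPerSecond;
}

// Largest viseme offset in milliseconds (10,000 ticks per millisecond).
// Returns 0 for an empty timeline.
double LastVisemeOffsetMs(const std::vector<VisemeSample>& Visemes) {
    std::int64_t MaxTicks = 0;
    for (const VisemeSample& Sample : Visemes) {
        if (Sample.AudioOffsetTicks > MaxTicks) {
            MaxTicks = Sample.AudioOffsetTicks;
        }
    }
    return static_cast<double>(MaxTicks) / 10000.0;
}

// Ratio of the viseme timeline length to the audio duration.
// Returns 0 if the audio duration is not positive.
double TimelineRatio(const std::vector<VisemeSample>& Visemes, double AudioDurationMs) {
    if (AudioDurationMs <= 0.0) {
        return 0.0;
    }
    return LastVisemeOffsetMs(Visemes) / AudioDurationMs;
}
```

Since the last viseme normally falls at or slightly before the end of the audio, a ratio at or just below 1 would be expected for a healthy request, while a ratio close to 2 would correspond to the doubled timeline described above. Logging this ratio per request may help show whether the doubling correlates with the slow requests.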