VoiceCraft HF space build is broken

Open fredvb opened this issue 1 year ago • 3 comments

The HuggingFace space seems to fail to build with the latest changes. Any clues on what might be causing this?

Apr 23 '24 16:04 fredvb

I'm working on it. In the meantime, please use Gradio through Google Colab: https://colab.research.google.com/drive/1IOjpglQyMTO2C3Y94LD9FY0Ocn-RJRg6?usp=sharing

Apr 23 '24 16:04 jasonppy

Does the suggestion mentioning the prompt duration and total duration still apply?

`Don't make your prompt end time too long, 6-9s is fine. Or else it will either raise up JSON issue or cut off your generated audio. This one is due to how VoiceCraft worked (so probably unfixable). It will add those text you want to get audio from at the end of the input audio transcript. It was way too much word for application or code to handle as it added up with original transcript. So please keep it short.

Your total audio length (prompt end time + add-up audio) must not exceed 16 or 17s.`

From what I understand, the Long-TTS option generates multiple audio files and then concatenates them. Is that correct? If so, I assume each phrase (newline) should stay within these limits?

Apr 23 '24 16:04 fredvb

HF spaces is up and running.

I uploaded the colab notebook to reflect the longer duration supported by the newer TTS-enhanced models. I personally found that 3~4s is usually enough. For Long-TTS, you just need to make sure each sentence (along with the prompt) doesn't exceed 20s.

Apr 23 '24 19:04 jasonppy
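
For anyone who wants to sanity-check Long-TTS input before generating, here is a rough sketch of the 20s rule mentioned above. It is not part of VoiceCraft: the function names are hypothetical, and the speaking rate of 2.5 words per second is purely an assumption used to guess how long each line will take to speak, so treat the results as estimates only.

```python
# Rough pre-check for Long-TTS input. NOT part of VoiceCraft.
# ASSUMED_WORDS_PER_SECOND is a guess; adjust it for your speaker.
ASSUMED_WORDS_PER_SECOND = 2.5
MAX_TOTAL_SECONDS = 20.0  # prompt + sentence limit mentioned in this thread


def estimate_seconds(sentence, words_per_second=ASSUMED_WORDS_PER_SECOND):
    """Estimate spoken duration from the word count (a crude heuristic)."""
    return len(sentence.split()) / words_per_second


def check_long_tts_lines(text, prompt_seconds, max_total_seconds=MAX_TOTAL_SECONDS):
    """Return (line, estimated_total_seconds, fits) for each non-empty line."""
    results = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines produce no audio segment
        total = prompt_seconds + estimate_seconds(line)
        results.append((line, total, total <= max_total_seconds))
    return results


if __name__ == "__main__":
    sample = "This is a short sentence.\nThis is another line to synthesize."
    for line, total, fits in check_long_tts_lines(sample, prompt_seconds=4.0):
        status = "ok" if fits else "too long"
        print(f"{total:5.1f}s  {status}  {line}")
```

Lines flagged as too long can be split into shorter ones, or the prompt can be shortened, so that each segment stays under the limit.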
