StyleTTS2
Audio Length Customization
I want to find a way to increase the 300-second limit. Is there any way to do that?
The 300-second limit comes from an underlying constraint: BERT's maximum input length of 512 tokens. You can work around it by splitting your text into sentences and processing them one at a time. Generate the audio for each chunk, then concatenate the clips in order to get seamless output of any desired length.
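Here is a minimal sketch of that split-and-stitch approach. It is not StyleTTS2's own API: `synthesize` is a hypothetical callable you would wrap around your actual inference code, `fake_synthesize` is only a stand-in so the example runs, and the 24000 Hz sample rate and 0.2-second pause are assumptions to adjust. The regex splitter is deliberately naive (it will break on abbreviations like "Dr."), so swap in a proper sentence tokenizer if you need one.

```python
import re
from typing import Callable, List

SAMPLE_RATE = 24000  # assumed output sample rate; change to match your model


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def synthesize_long(
    text: str,
    synthesize: Callable[[str], List[float]],  # hypothetical: sentence -> samples
    pause_seconds: float = 0.2,                # assumed gap between sentences
    sample_rate: int = SAMPLE_RATE,
) -> List[float]:
    """Synthesize each sentence separately and join the clips in order."""
    silence = [0.0] * int(pause_seconds * sample_rate)
    audio: List[float] = []
    for i, sentence in enumerate(split_sentences(text)):
        if i > 0:
            audio.extend(silence)  # short pause so sentences don't run together
        audio.extend(synthesize(sentence))
    return audio


def fake_synthesize(sentence: str) -> List[float]:
    """Placeholder for real inference: returns 10 silent samples per character."""
    return [0.0] * (len(sentence) * 10)


if __name__ == "__main__":
    samples = synthesize_long("First sentence. Second one! A third?", fake_synthesize)
    print(f"{len(samples) / SAMPLE_RATE:.2f} seconds of audio")
```

Since each call only ever sees one sentence, the input stays well below the 512-token limit as long as no single sentence is extremely long. If your inference code returns NumPy arrays, you would collect the clips in a list and join them with `numpy.concatenate` instead of extending a Python list.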