Puyuan Peng

Results 97 comments of Puyuan Peng

If you finetune our released model, please use the same released EnCodec. I trained an EnCodec myself because this way I can have fewer codebooks for high-fidelity resynthesis.

TTS inference already runs in batch mode; right now it just runs the same transcript across the batch and selects the shortest output. The code needs some tweaks to support real batch processing.
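To make the "same transcript, pick the shortest" idea concrete, here is a minimal sketch. It is not the repository's actual inference code: `make_toy_sampler`, `sample_same_transcript`, and `pick_shortest` are hypothetical names, and the toy sampler only stands in for a real stochastic TTS decoding step that returns a list of codec tokens.

```python
import random
from typing import Callable, List, Sequence


def pick_shortest(candidates: Sequence[List[int]]) -> List[int]:
    """Return the candidate with the fewest tokens (the first one wins ties)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return min(candidates, key=len)


def sample_same_transcript(
    sample_fn: Callable[[str], List[int]], transcript: str, batch_size: int
) -> List[List[int]]:
    """Draw `batch_size` independent samples for one and the same transcript."""
    return [sample_fn(transcript) for _ in range(batch_size)]


def make_toy_sampler(seed: int) -> Callable[[str], List[int]]:
    """Hypothetical stand-in for a TTS model: output length varies per call."""
    rng = random.Random(seed)

    def toy_sampler(transcript: str) -> List[int]:
        # Assumption for illustration only: 3 to 6 tokens per word.
        n_tokens = len(transcript.split()) * rng.randint(3, 6)
        return [rng.randrange(1024) for _ in range(n_tokens)]

    return toy_sampler


if __name__ == "__main__":
    sampler = make_toy_sampler(seed=0)
    candidates = sample_same_transcript(sampler, "hello world", batch_size=4)
    best = pick_shortest(candidates)
    print([len(c) for c in candidates], "->", len(best))
```

Supporting a batch of different transcripts would presumably need per-item padding and stopping logic, which this sketch does not attempt.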

https://huggingface.co/datasets/speechcolab/gigaspeech

1. The dataset sounds good. 2. This is something worth exploring. 3. We would need experiments to see. Making it more efficient is a good research/engineering problem.

Thank you so much for this, I tried to change `input_audio.change` to `input_audio.upload` in `gradio_app.py` so that it supports user uploaded audio, but after hitting the Run button it will...

> It worked for me but I'm not sure why it was locked to only the demo audio. I just made it editable and then the UI works. By making...

Are you sure? You probably also need to change `input_audio.change` to `input_audio.upload`, right? Otherwise it will give an error when you clear the original audio.

> New version is great. The voices sound a lot better. Only thing is that I get a bit of the last word it's continuing from in my actual prompt...

Thanks for the amazing work! One thing I realized is that once you hit Run, if you want to change something, e.g. upload a different audio file, or change a...

> words count mismatch phonemizer warning

'words count mismatch phonemizer warning' is completely fine.