Liumeng Xue comments

Results: 25 comments by Liumeng Xue

Long-form synthesis

> Currently we train on a maximum of 30-second audios. With @ylacombe we're looking at increasing the context length to potentially longer audio lengths. Alibi embeddings (or a variant thereof)...

Please format the code with the `black` formatter, as described [here](https://github.com/open-mmlab/Amphion/blob/main/.github/CONTRIBUTING.md). Please also provide the final checkpoints and samples, and keep the commit messages concise.

In the future, will this repo support advanced models like XTTS-v2, VITS2, StyleTTS2?

We would like to integrate more models, support reproducible research, and contribute to the community. We also warmly welcome everyone to participate.

[Error] Please specify the running stage

Sorry, but I can't reproduce the error you mentioned. Could you please provide more details or error messages?

An issue with the preprocessing part of LibriTTS.

Thanks for your feedback. We have updated the LibriTTS dataset processor in [this PR](https://github.com/open-mmlab/Amphion/pull/25), which fixes this bug. Please pull the latest code and try it...

An issue with the preprocessing part of LibriTTS.

Thanks for your feedback. If you only want to extract phoneme sequences, you can comment out lines 105-219 in [`bins/tts/preprocess.py`](https://github.com/open-mmlab/Amphion/blob/main/bins/tts/preprocess.py). This will skip the previous processing...

An issue with the preprocessing part of LibriTTS.

The `dataset_types` variable is defined in lines 140-146, so you should keep them uncommented.

An issue with the preprocessing part of LibriTTS.

Please check `egs/tts/VALLE/exp_config.json` in this PR: https://github.com/open-mmlab/Amphion/pull/52/files#

An issue with the preprocessing part of LibriTTS.

> It feels like Amphion has been in a constant process of revision, which is really confusing for a beginner like me. I've been constantly using "git pull origin" to...

An issue with the preprocessing part of LibriTTS.

> 1. I found that the audio generated by VALLE is very poor. It gives this warning: "WARNING | phonemizer | words count mismatch on 200.0% of the lines". Is...