Liumeng Xue

Results: 25 comments by Liumeng Xue

> Currently we train on a maximum of 30-second audios. With @ylacombe we're looking at increasing the context length to potentially longer audio lengths. Alibi embeddings (or a variant thereof)...

Please format the code with the `black` formatter, as described [here](https://github.com/open-mmlab/Amphion/blob/main/.github/CONTRIBUTING.md). Please also provide the final checkpoints and samples, and keep the commit messages concise.

We would like to integrate more models, support reproducible research, and contribute to the community. We also warmly welcome everyone to participate.

Sorry, but I can't reproduce the error you mentioned. Could you please provide more details or error messages?

Thanks for your feedback. We have updated the LibriTTS dataset processor in this PR, https://github.com/open-mmlab/Amphion/pull/25, which fixes this bug. So please pull the latest code and try it...

Thanks for your feedback. If you only want to extract phoneme sequences, you can comment out lines 105-219 in the file located at https://github.com/open-mmlab/Amphion/blob/main/bins/tts/preprocess.py. This will skip the previous processing...

The `dataset_types` variable is defined in lines 140-146, so you should keep them uncommented.

Please check `egs/tts/VALLE/exp_config.json` in this PR https://github.com/open-mmlab/Amphion/pull/52/files#

> It feels like Amphion has been in a constant process of revision, which is really confusing for a beginner like me. I've been constantly using "git pull origin" to...

> 1. I found that the audio generated by VALLE is very poor. It also gives this warning: "WARNING | phonemizer | words count mismatch on 200.0% of the lines". Is...