megatts2
Unofficial implementation of Megatts2
Dear author, are you still working on this project? Megatts2 is quite impressive, and I'm really grateful that you were able to implement it. I'm very much looking forward to...
Training data
Thanks for open-sourcing this! How many hours of data was the ckpt on Hugging Face trained on?
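The code is wonderfully concise! I found an issue: in the ADM infer part, the currently predicted dt_predict is a fractional value rather than an integer, and it is concatenated directly onto p_code as the input to the next step. This does not seem to match training, where the inputs are all integers, and it could produce unexpected results. Suggested change: dt_predict = torch.round(dt_predict).clamp(1, self.max_duration_token)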
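The round-then-clamp step proposed for dt_predict can be illustrated without PyTorch. This is only a minimal sketch: the helper name `round_and_clamp`, the sample values, and the upper bound of 100 are hypothetical and not taken from the repository. It mirrors what `torch.round(x).clamp(1, max)` would do elementwise, on the assumption that durations should be integers in `[1, max_duration_token]`.

```python
def round_and_clamp(durations, max_duration_token):
    """Round fractional durations to integers, then clamp to [1, max_duration_token].

    Hypothetical helper for illustration only; it is not part of megatts2.
    """
    result = []
    for d in durations:
        # Python's round() uses round-half-to-even, the same tie rule as torch.round.
        rounded = round(d)
        # Keep the value inside the range assumed to be seen during training.
        result.append(min(max(rounded, 1), max_duration_token))
    return result


# Sample fractional predictions (made-up values, not model output).
predicted = [0.2, 3.7, 12.4, 250.9]
print(round_and_clamp(predicted, max_duration_token=100))  # [1, 4, 12, 100]
```

Here a prediction below 1 is raised to 1 and a prediction above the maximum is capped, so every value fed back into the next step is an integer within the assumed valid range.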
Hello, do you have a demo from a trained model to listen to, so we can see how well the reproduction works?
Implementation of WebUI for MegaTTS2

## TODO
- [x] Initial Implementation
- [ ] Better Layout
- [ ] Provide more guidance for beginners
- [ ] Launching scripts
- ...
Hi, thanks very much for sharing your pretrained checkpoint! Wonderful work! Thanks for your contribution! I'd like to know what dataset you trained this model on and if you can...
https://github.com/LSimon95/megatts2/blob/c9ca2a88febf9db2cf4d8da0860efc9948db2b76/modules/mrte.py#L63 [image] paper: https://arxiv.org/abs/2307.07218 reference: [image]
https://github.com/LSimon95/megatts2/blob/2ab81a1234791e809cdcfca12e3c7b3d0cc2a9f3/modules/datamodule.py#L405
At inference time, the prosody model doesn't seem to be able to use the prosody code extracted from the prompt audio?