adelacvg

45 comments by adelacvg

Yes, and it's exactly what I'm working on. You can have a look [here](https://github.com/adelacvg/NS2VC/tree/v4) for a rough idea of the approach. The main idea is to utilize Referencenet to enhance...

I do not recommend training with v3 because it still uses inefficient modules like FiLM for timbre addition. As for training with v4, all I can say is that the...

I trained using the same dataset as v2, which is a mixed dataset containing both Chinese and English.

The current code is trainable, and I have obtained some promising results. It's worth noting that the convergence is slow, and a batch size of 32 takes about 500k steps...

I only used 300 hours of data, and the training was done exclusively on two GeForce RTX 3090 GPUs.

The README of the current v4 branch describes how to train on any language. You can try it by following the instructions there.

https://github.com/152334H/tortoise-tts-fast uses a faster sampling method for the diffusion model. Additionally, XTTS replaces the diffusion model with HiFi-GAN, resulting in significantly faster inference.

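Yes, it is supported. Generalization is poor with a small dataset, but performance on in-set data is fine. For good generalization, I recommend fine-tuning from a large pretrained model.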

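@ZJ-CAI Converting other timbres to your timbre should work fairly well. Converting your timbre to out-of-set timbres also depends on the size of the training set. For better out-of-set generalization, you can try training the v4 model.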

You haven't done anything wrong. Because the v4 model has over 200 million parameters, training is very slow. I am currently experimenting with features such as offset...