tortoise-tts
Due to recent development, maybe release training components/scripts?
I know the author did not want to share the training procedure for ethical reasons. But looking at recent developments (for example VITS), isn't that policy a bit outdated? There are already trainable tools out there that perform at the same level or even better. I think there is no point in blocking further development. What is your view on that?
The training procedure is available, someone figured it out.
Yeah, I found a high-level explanation, but to my knowledge it is still necessary to train all the models, because the codebook apparently does not exist anymore. I do not fully understand all the complex subparts of the model, but apparently without this piece the only option is to train end-to-end. That's probably not an issue for people with expensive hardware, but for hobbyists like me it's quite a limitation not to be able to fine-tune. Nevertheless, thanks for all the effort in putting up a project like this! I really appreciate that people still produce quality open source stuff at a time when A.I. is more and more dominated by industry :)
No I mean someone else (not me) put together a repo that allows you to fine tune the model.
Ah I see, I think I found the repo. I have access to a few A100s and will try to train and/or fine-tune the tortoise model on my native language. If I succeed I will make it publicly available!
Hey, can you send the link of that repo?
I'd like to see it too, thanks
Hey, I'd like to see the repo, too. Thanks!
@asusdisciple you talked about wanting open-source training code, but then went underground yourself?
@RahulBhalley @hairuo55 @Shivamkumar285 I think I've found it, but not sure
mrq/ai-voice-cloning. It's not on GitHub; google it, it's the first link in the search results.
Thanks @NikitaKononov. The wiki information looks very, very promising! Hoping it runs fast on the M1 chip as well. 🤞🏽
You can't train these models on a CPU, an M1, or low-end GPUs; you'll need bigger GPUs. You can fine-tune on lower-end GPUs with 15-24 GB of VRAM.
I have access to an RTX 4090. The only problem is inference in production environments. Sadly, this repo's Tortoise inference is insanely slow!
@NikitaKononov Thanks a lot for the information about mrq/ai-voice-cloning.
@neonbjb Hi, would you consider releasing the original scripts, since someone has already figured out the training procedure? All the training scripts available right now are licensed under GPL/AGPL, and I was wondering if the official ones could be released under Apache 2.0. Thanks!
@fakerybakery I've managed to get fine-tuning for the autoregressive model working, using some of the original author's code and the DLAS library. I'm in the process of cleaning it up a bit more, but a working version is here: https://github.com/andrewsilva9/tune_tortoise_autoregressor (it uses Apache 2.0 because I inherited code from the original author, so I copied the license forward).
Great! Thanks!
Also @andrewsilva9, you mentioned in the README that some code comes from the mrq repo. Does this mean the AGPL license applies?
Let me double check my code to be sure, but I think everything is adapted/copied from DLAS or TorToiSe. I leaned heavily on the mrq repo to see how things connect and to track things through a debugger. I'll double check and update the README/scripts tomorrow!