Abandoned project?
Model seems promising, combining low latency and high quality.
For some reason the creators chose to release it with very few default voices, which makes fine-tuning mandatory.
And now this project feels completely abandoned, with MyShell seemingly busy with other things.
As far as I can tell, the main reason this model exists is to serve as a foundation for training custom voices that can be sold on the marketplace at myshell.ai.
It seems so. Unfortunately, a lot of people are having problems fine-tuning a custom voice successfully, at least judging by some of the issues in the repo. Piper, for example, is another on-device TTS system, but it uses the older VITS. This one uses the improved VITS2 and some other architectures. There seem to be no problems fine-tuning the Piper checkpoints.