Thanks for this :)
Over the last few days, I've been testing a number of tools, including TTS software. My choice was Kokoro and its only French voice, Siwis. The result is excellent, but with my configuration it takes 25-30 seconds or more to generate an audio file.
Then I remembered that 2-3 years ago I'd used a plugin (based on Piper) for RuneScape that read NPC text in less than a second with excellent quality. I've just tested it to compare it with Kokoro and ... Piper is an absolute marvel! The voice (Siwis) is very slightly more robotic and some words are a little less well pronounced than with Kokoro, but considering the time saved, I'll stick with Piper.
Too bad you don't have time to improve Piper. In my opinion, it has the potential to become the open-source TTS reference, and it's only 60 MB! (more than 6 GB for Kokoro ...)
PS: Is it easy to create a fine-tune? I'd like to try to create a better-quality Siwis model if possible.
Piper is still being developed 🙂 https://github.com/OHF-Voice/piper1-gpl Fine-tuning a voice isn't too hard, but it depends heavily on your setup: https://github.com/OHF-Voice/piper1-gpl/blob/main/docs/TRAINING.md
Thanks, I'll try that :)
Edit 1: This command (python3 -m pip install -e .[train]) returns an error: zsh: no matches found: .[train]
Edit 2: Solved with python3 -m pip install -e ".[train]"
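For anyone else who hits this: as far as I understand it, zsh reads the unquoted [train] as a glob character class (one character out of t, r, a, i, n), looks for files matching that pattern, finds none, and aborts before pip even runs. Quoting passes the text through literally. Here is a rough sketch of the matching rule in Python. It uses the standard fnmatch module as a stand-in for shell globbing, so it only approximates what zsh does, and the candidate names are made up for illustration.

```python
from fnmatch import fnmatchcase

# The unquoted argument from the failing command, seen as a glob pattern.
PATTERN = ".[train]"


def matches(name: str) -> bool:
    """Return True if `name` matches PATTERN under fnmatch-style globbing."""
    return fnmatchcase(name, PATTERN)


# Hypothetical file names, chosen only to illustrate the rule.
candidates = [".t", ".r", ".n", ".[train]", ".x"]

# "[train]" stands for exactly one character from the set t, r, a, i, n,
# so only two-character names such as ".t" can match.
matched = [name for name in candidates if matches(name)]
print(matched)

# The literal text ".[train]" is eight characters long, so it does not
# match its own pattern; the shell has to be told to keep it literal.
print(matches(".[train]"))
```

So the pattern only matches names like .t or .r, never the literal text pip needs, which is why the quoted form works.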