Multiple Phonemizer Support
The piper-phonemize setup is a bit confusing at the moment, as it's both included with some significant code and imported as a library at runtime. The two phonemizers, text and espeak, are tightly integrated into piper and piper-phonemize. Furthermore, they link against the espeak-ng library, which is under the GPL, meaning piper-phonemize is also under the GPL (when distributed), and thus piper is under the GPL as well.
My proposal is this:
- Create a standard interface for a phonemizer between piper/piper-phonemize. This could be 3 functions: initialize, phonemize, terminate. The initialize could also pass in configuration data if required.
- Have the phonemizer be selectable at startup via a flag instead of from the voice config. I'm not sure if there's a technical reason the phonemizer is configured in the voice .json file, but it seems unnecessary as long as the phonemes match.
- Separate the phonemizers within piper-phonemize to be different libraries that are loaded only if the configuration requires it. For example on Linux to phonemize text into a vector of phonemes using espeak:
#include <dlfcn.h>  // dlopen/dlsym on Linux; Windows would use LoadLibrary/GetProcAddress
auto libraryHandle = dlopen("phonemizer_espeak.so", RTLD_LAZY);
auto phonemizeFn = reinterpret_cast<void (*)(const std::string &, std::vector<std::vector<Phoneme>> &)>(dlsym(libraryHandle, "phonemize"));
phonemizeFn(text, phonemes);
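The proposed three-function interface could look something like the sketch below. All names here (phonemizer_initialize, etc.) are illustrative, not an existing piper-phonemize API, and the body is a toy stand-in just to show the shape a real plugin would fill in:

```cpp
// Hypothetical three-function plugin interface for a piper phonemizer.
// All names here are illustrative, not an existing piper-phonemize API.
#include <string>
#include <vector>

using Phoneme = char32_t;  // piper-phonemize represents phonemes as UTF-32 codepoints

// C linkage so dlsym()/GetProcAddress() can resolve the symbols by name.
extern "C" {
bool phonemizer_initialize(const char *configJson);
void phonemizer_phonemize(const std::string &text,
                          std::vector<std::vector<Phoneme>> &phonemes);
void phonemizer_terminate();
}

// Toy implementation that treats each byte as its own "phoneme",
// just to show the shape a real plugin would fill in.
bool phonemizer_initialize(const char * /*configJson*/) { return true; }
void phonemizer_terminate() {}
void phonemizer_phonemize(const std::string &text,
                          std::vector<std::vector<Phoneme>> &phonemes) {
  std::vector<Phoneme> sentence;
  for (unsigned char c : text)
    sentence.push_back(static_cast<Phoneme>(c));
  phonemes.push_back(std::move(sentence));
}
```

Each phonemizer library would export these three symbols, and piper would only ever call them through pointers obtained from dlsym/GetProcAddress.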
This would allow an easy way to integrate a new phonemizer without updating both programs and even allows a new library to be added without updating piper-phonemize. Plus, the dependency on espeak-ng would be optional which means it could be distributed under the much more permissive MIT license.
I can implement some of the changes to do this, but as it would be a fairly substantial change, I thought it would be best to discuss it first.
I was going to create a similar issue. Thanks to the author for all the hard work. Really cool project.
Judging by the comment here: https://github.com/OpenVoiceOS/ovos-tts-plugin-piper/issues/2#issuecomment-1579658136 I think the problem is that different phonemizers generate different IPA characters when phonemizing (because the author said piper models would likely need to be retrained for a new phonemizer).
So if another phonemizer generates a sequence of IPA characters the current models aren't trained on, speech synthesis isn't going to work. There is a function in this repo, phonemes_to_ids, which will pass back "missing phonemes" if you feed piper phonemized text it doesn't understand (which an alternative phonemizer may or may not generate). I don't think the current phonemizer supports every IPA character, so just swapping in a new phonemizer likely isn't easy.
Ideally, if there were another phonemizer out there that restricts its output to only the IPA characters espeak-ng currently uses, it would be backward compatible with the already trained models. As long as the alternative phonemizer generates IPA characters that map to ids the piper models understand, it should work. I don't think this would be a GPL issue as long as the new phonemizer uses its own algorithm to phonemize (it wouldn't be a derivative of espeak-ng). I don't think you can GPL the alphabet.
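The "missing phonemes" mechanism described above can be sketched roughly like this (illustrative code, not the actual phonemes_to_ids from this repo):

```cpp
// Sketch of the "missing phonemes" idea: map phonemes to model ids and
// collect anything the id table doesn't know about. Illustrative only;
// not the actual phonemes_to_ids from this repo.
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Phoneme = char32_t;
using PhonemeId = std::int64_t;

std::vector<PhonemeId>
phonemes_to_ids(const std::vector<Phoneme> &phonemes,
                const std::map<Phoneme, PhonemeId> &idMap,
                std::map<Phoneme, std::size_t> &missing) {
  std::vector<PhonemeId> ids;
  for (Phoneme p : phonemes) {
    auto it = idMap.find(p);
    if (it != idMap.end())
      ids.push_back(it->second);
    else
      missing[p]++;  // a phoneme the model was never trained on
  }
  return ids;
}
```

Any alternative phonemizer whose output lands entirely inside idMap would pass through with an empty missing table, which is the backward-compatibility condition discussed above.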
I'm considering two alternatives to espeak-ng to avoid licensing issues:
- Using text phonemes with byte-pair encoding (BPE), possibly with [pre-trained sentencepiece models](https://github.com/bheinzerling/bpemb)
- Reviving the gruut project and porting it to C++
In both cases, I expect that all of the voices will need to be retrained.
For option 1, I don't think training from a base English voice will work as well anymore because of differing character sets. Option 2 will have limited language support, and the licensing on the phoneme dictionaries is completely unknown (many have been floating around the internet for years without attribution).
Here's another question, likely with no answer: if I were to implement my own (clean room) copy of eSpeak's phonemization rule engine, would the dictionary data files be usable without a GPL license? I see 3 other licenses in the espeak-ng repo (BSD, Apache 2, UCD), so I have no idea what applies to the source code vs. data files.
The espeak library is pretty good at its job and doesn't necessarily need to be replaced; it just needs to be less tightly coupled to piper so someone could swap it out with a different library if they wanted.
Then the phonemizer could be espeak or gruut or sequitur or whatever. Making a new phonemizer is a big endeavor and there's no need to re-invent the wheel.
> Here's another question, likely with no answer: if I were to implement my own (clean room) copy of eSpeak's phonemization rule engine, would the dictionary data files be usable without a GPL license? I see 3 other licenses in the espeak-ng repo (BSD, Apache 2, UCD), so I have no idea what applies to the source code vs. data files.
The rules files at least have the GPLv3 license at the top; I imagine the dictionary files do as well, but it's not too difficult to find dictionary files elsewhere.
The phonemizer appears to be tightly coupled to piper because the voice models piper uses understand the phonemes espeak produces. There isn't a universal way to phonemize. As the author said he expects that all the existing voice models would need to be retrained for a different phonemizer. If you have to train a new voice model per phonemizer that isn't going to scale.
I tried swapping in a different phonemizer, but it phonemizes in a different way than espeak; it uses some phonemes that espeak doesn't use and vice-versa. I think I can remap some of the phonemes in the replacement phonemizer to equivalent ones the model understands to mitigate this, but it looks like it is going to be a bit hairy.
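The remapping idea could be as simple as a codepoint table applied to the replacement phonemizer's output. A sketch, where the table entries would have to be curated by hand (nothing below comes from piper itself):

```cpp
// Sketch of remapping a replacement phonemizer's output onto the phoneme
// set an existing voice model understands. A real table would need
// careful, per-language curation.
#include <map>
#include <vector>

using Phoneme = char32_t;

std::vector<Phoneme> remap(const std::vector<Phoneme> &in,
                           const std::map<Phoneme, Phoneme> &table) {
  std::vector<Phoneme> out;
  out.reserve(in.size());
  for (Phoneme p : in) {
    auto it = table.find(p);
    out.push_back(it != table.end() ? it->second : p);  // pass through unmapped phonemes
  }
  return out;
}
```

The hairy part is that not every substitution is one-to-one; some phonemes may need to map to sequences, or have no close equivalent at all.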
A less sophisticated yet still complicated approach is to build a phonemizer using all the same phonemes as espeak (no more and no less). The IPA characters themselves can't be GPL'd. If you could GPL the alphabet all written text would be considered a derivative.
I don't think you can GPL a map table: ["Apple" : 🍎], but the dictionary data files seem to be a bit more than that. I would think using them directly would probably require licensing the new phonemizer under the GPL, so it would probably be better to avoid them.
A phonemizer that outputs the same IPA characters would be backward compatible, though perhaps constraining oneself to use only the phonemes that espeak does would feel too restricting. That would perhaps be one of the tradeoffs for trying to be a "swap in" replacement for espeak.
@SeymourNickelson I actually did train an "eSpeak compatible" phonemizer in gruut; there are separate database files for that. It works OK, but espeak-ng is a bit more sophisticated than you might expect. It handles some part-of-speech dependent pronunciation rules (for English at least) like "I read a book yesterday" vs. "I read books often". Additionally, it's able to break apart words somewhat intelligently: like pronouncing a username hansenm as "Hansen M".
@kbickar I don't want to reinvent the wheel, but the licensing question comes up quite frequently. Similarly, using the Lessac voice as a base adds more questions when people want to use Piper commercially. While I sympathize with the GPL philosophy, I prefer to keep my stuff MIT/public domain. And if I'm going to suggest people contribute pronunciation fixes, etc., it makes more sense to do it for a project with fewer restrictions.
At least the ability to train a new base model from scratch is relatively straightforward, so a model without the Lessac dataset can be created and used with piper out of the box.
Some sort of plugin interface would be great.
> @SeymourNickelson I actually did train an "eSpeak compatible" phonemizer in gruut; there are separate database files for that. It works OK, but espeak-ng is a bit more sophisticated than you might expect. It handles some part-of-speech dependent pronunciation rules (for English at least) like "I read a book yesterday" vs. "I read books often". Additionally, it's able to break apart words somewhat intelligently: like pronouncing a username hansenm as "Hansen M".
Cool! I'll have to check out Gruut. It seems eSpeak tries to go the extra mile phonemizing (I haven't looked at the internals), but it definitely doesn't handle everything perfectly either. In my testing it didn't handle "I read a book yesterday" properly. I wonder if there is a good open source "part of speech tagger" out there that input text could be fed through before phonemizing, to disambiguate words pronounced differently in different contexts.
Unfortunately for me I'm not working in Python so I'd have to port Gruut to my native programming language (which isn't C++ either, although a C++ version would be more accessible for my target platform). Might be worth it. I just did this (ported from Python) with another phonemizer but unfortunately that one phonemizes every word independently and not in the context of surrounding words; it doesn't try to handle some of these complex pronunciation rules you mention.
Gruut's supported language list would be enough for me, so if you did port it to C++ for Piper at some point, maybe those needing a phonemizer in another language could fall back to espeak.
from a dev POV, I would like to see gruut as an option, and honestly I would love to see a c++ incarnation that is continuously updated. just like this project is now tackling license issues instead of focusing on the code, the same will happen to future projects that use espeak (i expect that to not be uncommon, due to the lack of alternatives). a permissively licensed phonemizer to replace espeak would benefit the whole voice ecosystem and help future devs and projects avoid this same issue
let's assume gruut voices sound worse than espeak voices. from a user POV it would be nice if piper supported both gruut and espeak voices: just making espeak optional makes piper GPL-free. using a voice that needs espeak will drag in the GPL license, but that is then voice-specific and not library-specific. users can pick whatever voice sounds best to them, espeak or gruut based; a user won't care about GPL
i understand this means at least double the work, without even counting the time to port gruut to c++. totally understandable if it's not feasible, but i wanted to leave my 2 cents
> 1. Using text phonemes with byte-pair encoding (BPE), possibly with [pre-trained sentencepiece models](https://github.com/bheinzerling/bpemb)
I recently came across this paper https://assets.amazon.science/25/ae/5d36cc3843d1b906647b6b528c1b/phonetically-induced-subwords-for-end-to-end-speech-recognition.pdf and I previously also played around with this repo https://github.com/hainan-xv/PASM
this is a bit over my area of expertise, but you should be able to understand the nuances better and judge if it's applicable or useful
> Closer to our approach is the Pronunciation Assisted Subword Modelling (PASM) that was shown to outperform BPE and single character baselines [27]. Subword generation in PASM is based on consistent alignments between single phonemes and single characters. A downside of this approach is that it tends to choose short subwords and avoids modelling full words with single tokens. As a consequence, subword variability is limited and, along with the method's exclusion criteria, the resulting vocabularies are relatively small (around 100 and 200 subwords for WSJ and Librispeech respectively). We compare our results to PASM in our 200 subword experiments.
apologies if this is irrelevant, but since you mentioned BPE i thought it could be helpful
Not directly related, but maybe relevant to the adoption of general-purpose phonemizers.
I did integrate our Icelandic phonemizer into piper directly as an alternative to eSpeak, because the Icelandic version of eSpeak uses an old IPA symbol set and additionally the normalization is not very good for Icelandic (e.g. homographs, dates and numbers, I am looking at you ...).
The integration was not really difficult and took me half a day or so, because our pipeline is also Python-based: see https://github.com/grammatek/ice-g2p for the phonemizer. In Piper, however, I changed the symbols and only used those of our alphabet. As I am training from scratch and don't want to fine-tune any existing model, that's probably ok.
We are using X-SAMPA by default in our grammars and symbols, but remapping this to IPA is just a lookup. See https://github.com/grammatek/ice-g2p/blob/master/src/ice_g2p/data/sampa_ipa_single_flite.csv
We also have an Android app that uses the C++ library Thrax for G2P: https://github.com/grammatek/g2p-thrax. Thrax is totally rule-based and does not perform as well as the other G2P module, but it is good enough for most purposes.
The ice-g2p module uses a BI-LSTM for the conversion, which is pretty good. But Icelandic pronunciation is also very regular, and only homographs need special treatment.
What we do additionally is using a very big G2P dictionary to speed up our inference time. This just needs to be processed once in a while offline and then you can use it efficiently at runtime. If you are processing a large enough corpus of a specific language, you will get a very good coverage for most words. And homographs can be chosen dynamically depending on some rules/model instead.
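The dictionary-first approach described above can be sketched as a lookup with a model fallback. Names and types below are illustrative, not taken from ice-g2p:

```cpp
// Sketch of dictionary-first G2P: look each word up in a large precomputed
// pronunciation dictionary and fall back to a model only for OOV words.
// Names and types are illustrative.
#include <functional>
#include <string>
#include <unordered_map>

using Pronunciation = std::string;  // e.g. an IPA string

Pronunciation g2p(const std::string &word,
                  const std::unordered_map<std::string, Pronunciation> &dict,
                  const std::function<Pronunciation(const std::string &)> &oovModel) {
  auto it = dict.find(word);
  if (it != dict.end())
    return it->second;    // fast path: hit in the offline-generated dictionary
  return oovModel(word);  // slow path: e.g. a small LSTM exported to ONNX
}
```

With good corpus coverage the fast path handles the vast majority of words, so the model only runs rarely and runtime inference cost stays low.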
What we found absolutely necessary for text normalization is a PoS tagger. We also trained a BI-LSTM-based PoS model within our Icelandic language technology program, but there are some other alternatives available; for Python, e.g. StanfordNLP Stanza, under the Apache 2.0 license.
@lumpidu Thanks a lot for sharing. Very informative.
I integrated another phonemizer in Python to use in the Piper training script (basically followed the training guide). The only dependency I didn't install is piper-phonemize; I just stubbed in my own Python module that returned the expected data for preprocessing (shimmed in all the values from the replacement phonemizer).
Because this phonemizer uses different symbols than espeak, I also need to train from scratch. Do you mind sharing what hardware you are training on? I can't get piper to train on the GPU on my Apple hardware (and I'm not sure, even if I could get training on the GPU, whether it would be fast enough). Google Colab keeps throwing me off my training session before I can finish, even though I still have compute credits. Colab feels like a weird system: throw a paying customer out in the middle of a training session at any time, no questions asked; delete all the data and keep the money!
@SeymourNickelson: sure, we use our own compute hardware, a Ryzen Threadripper Pro workstation with 32 cores, 512GB RAM, lots of SSDs and 2x A6000 Nvidia cards. There is also a 3090 card inside that I mostly use for inferencing. I am currently training an xs model (with smaller parameter size but 22.05 kHz files) on my 2x A6000 cards. This model is meant for inferencing on the Android phone. Training runs smoothly, now at a bit more than 1500 epochs after almost 2 days overall, i.e. ~110 seconds/epoch with a >17,000-file dataset. Because these cards have 48GB RAM, I use a batch_size of 64 and a symbol_size of 600, and still the memory is not even half filled. I have no experience with Google Colab. We decided against using cloud GPUs one year ago, and owning a dedicated GPU workstation has a lot of pros that I don't want to miss. OTOH, I would use cloud GPUs from some of the usual suspects for training runs that take longer than, say, 2 weeks.
Flite and Festvox also have their own phoneme engines. Flite, for example, is under an unlicensed/BSD-style custom license. I believe it uses a version of the CMU dictionary (http://www.speech.cs.cmu.edu/cgi-bin/cmudict) and letter-to-sound rules. I like that it doesn't need runtime files, since it converts cmudict into C code structures and compiles them in directly. From what I've seen, the phonemes match espeak well (if you look at the Flite repo, there is a PR at the top there that converts Flite phonemes to IPA phonemes; this is the version).
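Flite's compiled-in dictionary idea, in miniature, could look like this (the entries below are made up; Flite generates its real table from cmudict at build time):

```cpp
// Miniature version of the Flite idea: compile the pronunciation dictionary
// into static C structures instead of shipping runtime data files.
// The entries below are made up; Flite generates its table from cmudict.
#include <string>

struct DictEntry {
  const char *word;
  const char *phones;  // ARPAbet-style phones, as in cmudict
};

static const DictEntry kDict[] = {
    {"cat", "K AE T"},
    {"dog", "D AO G"},
};

const char *lookup(const std::string &word) {
  for (const auto &entry : kDict)
    if (word == entry.word)
      return entry.phones;
  return nullptr;  // OOV: a real engine falls through to letter-to-sound rules
}
```

Baking the dictionary into the binary trades some executable size for zero runtime file dependencies, which is exactly the property praised above.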
It's truly kind of wild that there is no simple and straightforward C/C++ phonemizer. I ported Piper over to Zig for fun (https://github.com/sweetbbak/sayu), and espeak was kind of a pain to deal with. Same with libonnxruntime, if I'm being honest.
I'd love to work on porting Flite's phonemizer over to C/C++ or Zig lol
We have gladly turned away from Flite/Festvox and never looked back! Building all the dependencies in the right order alone is a nightmare. There are so many compatibility issues and C/C++ compiler warnings, it's not even funny. The authors have also obviously abandoned the project (the last commit was 3 years ago).
I think the most straightforward method is a dictionary lookup created by a good model (you can also use eSpeak to generate that), in combination with a small LSTM model for OOV words in your language, using ONNX Runtime for inference. Or use the Thrax approach as we did for Icelandic, but then you need to create custom rules for each language.
to be clear, I'm talking about a community effort writing an entirely new phonemizer based on Flite or Flite's methods, not using it directly. Trying to read Flite/Festvox code feels like reading gibberish though; it's quite dense. What I personally don't like about using a model is the necessity to pull in dependencies to run that model, but overall it's a simple solution, so there's a tradeoff there.
I just think it would benefit a lot of people if there was an MIT or Unlicensed tokenizer/phonemizer that was lean and fast. But at the same time I'm just a text-to-speech enthusiast and I don't really have any skin in the game.
Ideally it would be an open interface, so you could build Piper and build a phonemizer to go with it. Unfortunately it's pretty tightly integrated, with headers and data structures that make it hard to build Piper without espeak.
I wouldn't remove eSpeak from Piper, rather add another option. As I said: it's relatively easy to create, for many languages, a large enough pronunciation dictionary, given you have enough vocabulary to go with it (or a large text corpus to derive it from). There will be OOV words, but most of the time (>99%) you can derive the pronunciation by generation via e.g. eSpeak or CharsiuG2P. You can even use GPL code to create the dictionaries without the need to link it in at runtime. This is the fastest, most lightweight way to do G2P.
There are already a large number of dictionaries available: CharsiuG2P
eSpeak could still be shipped with Piper; it would just be better to allow Piper to be built without depending on eSpeak, since that dependency puts it under the GPL 3.0 license.
I believe you can build a model using the "text" phonemizer, which just passes the phonemes through, allowing phonemization to be done as a preprocessing step. That allows dictionaries or other phonemizers to be used, just less conveniently.
Yeah, one could just remove the dependency and document the steps necessary to integrate it. This way, all the calling code could just stay, and the dependencies would be optional and pluggable.
Is it possible to do CharsiuG2P inference without using Python? It looks promising, but that seems excessive; I don't see any examples of such. You could naively do word lookups for most words, I guess.
It appears to need these at the very least:
from transformers import T5ForConditionalGeneration, AutoTokenizer
from segments import Tokenizer
I'm not sure how you would even go about that? At that point you may as well just shell out to espeak-ng/flite/festival etc...