GLaDOS voice

Open apiote opened this issue 2 years ago • 35 comments

Is there a GLaDOS voice for Piper, as there was for Larynx (https://github.com/rhasspy/larynx/issues/56)? Or possibly an easy way to convert one to the other? I added phonemes and missing entries to the JSON file, but there are still phonemes missing, and I get errors about the model:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: scales for the following indices
 index: 0 Got: 3 Expected: 2
 Please fix either the inputs or the model.
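
The message says the model declares its `scales` input with 2 elements while the caller passed 3. A minimal sketch for spotting this kind of mismatch before running inference follows. `find_shape_mismatches` is a hypothetical helper, not part of Piper or onnxruntime, and the shapes in the usage lines are simply the ones from the error above. For a real model, the declared shapes can be read with `onnxruntime.InferenceSession(path).get_inputs()`, whose entries expose `.name` and `.shape`.

```python
from typing import Dict, List, Sequence, Union

Dim = Union[int, str, None]  # ONNX dims may be symbolic (str) or unknown (None)


def find_shape_mismatches(
    declared: Dict[str, Sequence[Dim]],
    provided: Dict[str, Sequence[int]],
) -> List[str]:
    """Return human-readable mismatches between model inputs and fed arrays."""
    problems = []
    for name, want in declared.items():
        if name not in provided:
            problems.append(f"{name}: missing input")
            continue
        got = provided[name]
        if len(got) != len(want):
            problems.append(f"{name}: rank {len(got)} != expected rank {len(want)}")
            continue
        for index, (g, w) in enumerate(zip(got, want)):
            # Only fixed integer dimensions can conflict; symbolic ones accept any size.
            if isinstance(w, int) and g != w:
                problems.append(f"{name}: index {index} got {g}, expected {w}")
    return problems


# Shapes taken from the error message: the model wants 2 scales, the caller sent 3.
declared_shapes = {"scales": [2]}
provided_shapes = {"scales": [3]}
print(find_shape_mismatches(declared_shapes, provided_shapes))
```

If the declared shape really is `[2]`, the model was probably exported for a different inference pipeline than the one feeding it, so editing the JSON file alone would not be expected to fix it.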

apiote avatar Aug 28 '23 14:08 apiote

Tomorrow I will train a Glados dataset, but what worries me is the license to publish it.

rmcpantoja avatar Sep 04 '23 01:09 rmcpantoja

That's what I was afraid of. Would instructions to train a dataset on one's own be more in the clear? I have no idea about hardware requirements, though.

apiote avatar Sep 04 '23 09:09 apiote

To make things easier, I use Colab notebooks, since I don't have the hardware. To run it locally, you would need an NVIDIA GPU, and the parameters (e.g. batch_size) can be adjusted to the capabilities of your GPU.

rmcpantoja avatar Sep 04 '23 12:09 rmcpantoja

I don't have the hardware either. And I guess that even if only detailed instructions were published, they could still get DMCA'd, as happened to tools like youtube-dl.

apiote avatar Sep 11 '23 09:09 apiote

@rmcpantoja any update on the Glados training?

dnhkng avatar Nov 04 '23 20:11 dnhkng

Hi @dnhkng, I have made two GLaDOS models with my Colab notebooks, one in Spanish and one in English. Unfortunately, because the datasets are large, they need more training, and I do not have the resources to buy Colab Pro. They are the following: English and Spanish

rmcpantoja avatar Nov 08 '23 16:11 rmcpantoja

@rmcpantoja The English link doesn't work. I was going to try a finetune on the original game voice data. I have 2x 4090s, so I should have enough compute.

I could rip the voices from https://theportalwiki.com/wiki/GLaDOS_voice_lines but is there a dataset with this already prepared? Happy to share the results!

dnhkng avatar Nov 08 '23 16:11 dnhkng

Hi @dnhkng, that sounds strange; I am able to open the Drive folder with the English model without problems. Anyway, here is a model exported to ONNX.

The model was trained using this dataset, but I fixed many incorrect transcriptions myself.

rmcpantoja avatar Nov 10 '23 13:11 rmcpantoja

@rmcpantoja Thanks for the export! I found the checkpoint file eventually though, sorry! Sounds pretty good, I see it trained on colab for 2.25 hours.

I scraped the GLaDOS dataset (using only the Portal 2 voice and DLC), manually filtered out all the wav files that contained extras (laughing, telephones, beeping, etc.), and also fixed all the text. That gave me about 1 hour of high-quality data. I have currently fine-tuned for 15 hours on a 4090; it sounds very good, and the loss is still decreasing. I will train for 24 hours and see how the loss curves look.
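
For anyone repeating this preparation step, a rough sketch of totalling the audio in a folder of cleaned WAV files is below. It uses only Python's standard `wave` module; the folder name `wavs` in the usage lines is an assumption, not the layout actually used for this dataset.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path: Path) -> float:
    """Duration of one PCM WAV file, computed from frame count and sample rate."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def total_duration_hours(folder: Path) -> float:
    """Sum the durations of every .wav file directly inside `folder`."""
    seconds = sum(wav_duration_seconds(p) for p in sorted(folder.glob("*.wav")))
    return seconds / 3600.0


if __name__ == "__main__":
    dataset_dir = Path("wavs")  # hypothetical folder of filtered voice lines
    if dataset_dir.is_dir():
        print(f"{total_duration_hours(dataset_dir):.2f} hours of audio")
```

Checking the total after each filtering pass makes it easy to see how much usable audio is left.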

EDIT: Here are samples after 24 hours of fine-tuning. 'a' is the generated sample, 'b' is an unseen sample from the game. https://drive.google.com/drive/folders/1WVpS2zlJ9JqXIYV8Fkjoy5Fjz-eWPaEh?usp=sharing

I think the generated sample is better! Kudos to Piper, this is amazing!

dnhkng avatar Nov 13 '23 14:11 dnhkng

@dnhkng Hello, is it possible to get the model?

RoxBlox3 avatar Nov 26 '23 10:11 RoxBlox3

Yes, I will share it in the next few days. Doing a big refactor on the inference code.

dnhkng avatar Nov 28 '23 14:11 dnhkng

Sign me up as well

takov751 avatar Dec 07 '23 10:12 takov751

@dnhkng Any update on the model?

RoxBlox3 avatar Dec 10 '23 20:12 RoxBlox3

OK, the model is available here: https://github.com/dnhkng/GlaDOS

You can find the GlaDOS model in the models directory.

It includes my new code base to use the voice. Have a look in the Jupyter Notebook on how to use it.

If you instead want to use it with Piper, take the .onnx.json file from any medium-size Piper model, copy it, and rename the copy to glados.onnx.json; the model will then run with Piper.
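
A minimal sketch of that copy-and-rename step, assuming the GLaDOS model is saved locally as `glados.onnx` and that the config of some medium-quality Piper voice is already on disk. The file name `en_US-lessac-medium.onnx.json` is only an example, and `install_config` is a hypothetical helper rather than anything shipped with Piper.

```python
import shutil
from pathlib import Path


def install_config(donor_config: Path, model_path: Path) -> Path:
    """Copy a donor voice config next to `model_path` as `<model name>.json`."""
    if not donor_config.is_file():
        raise FileNotFoundError(f"donor config not found: {donor_config}")
    # The config sits beside the model, e.g. glados.onnx -> glados.onnx.json
    target = model_path.with_name(model_path.name + ".json")
    shutil.copyfile(donor_config, target)
    return target


if __name__ == "__main__":
    donor = Path("en_US-lessac-medium.onnx.json")  # example medium voice config (assumption)
    model = Path("glados.onnx")  # assumed local name of the GLaDOS model
    if donor.is_file():
        print("wrote", install_config(donor, model))
```

The donor file is copied rather than moved, so the original voice keeps working.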

dnhkng avatar Dec 11 '23 08:12 dnhkng

Thank you very much for your work @dnhkng 👍👍👍

takov751 avatar Dec 11 '23 09:12 takov751

For those of you who want to run GlaDOS onnx model on iOS, Android, Raspberry Pi, or use C++, C, Go, C#, Kotlin, Swift, Python, Java, or on Windows, Linux, macOS, etc, please have a look at https://github.com/k2-fsa/sherpa-onnx

We provide a colab to show you how to convert the GlaDOS model to sherpa-onnx https://colab.research.google.com/drive/1m3Zr8H1RJaoZu4Y7hpQlav5vhtw3A513?usp=sharing

The following is a sample command using the converted model with sherpa-onnx

# You can also use sherpa-onnx-offline-tts-play

sherpa-onnx-offline-tts \
  --vits-model=./glados.onnx \
  --vits-tokens=./tokens.txt \
  --vits-data-dir=./espeak-ng-data \
  --output-filename=./test-glados.wav \
  "How are you doing? This is a text-to-speech application using next generation Kaldi."
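
If you would rather drive the same command from a script, here is a small sketch that assembles the identical arguments in Python and runs them with `subprocess`. The flags and file paths are copied from the command above; `build_tts_command` and `synthesize` are hypothetical helper names, and the sketch assumes `sherpa-onnx-offline-tts` is on your `PATH`.

```python
import subprocess
from typing import List


def build_tts_command(text: str, output_wav: str = "./test-glados.wav") -> List[str]:
    """Assemble the same sherpa-onnx-offline-tts invocation as an argument list."""
    return [
        "sherpa-onnx-offline-tts",
        "--vits-model=./glados.onnx",
        "--vits-tokens=./tokens.txt",
        "--vits-data-dir=./espeak-ng-data",
        f"--output-filename={output_wav}",
        text,
    ]


def synthesize(text: str, output_wav: str = "./test-glados.wav") -> None:
    """Run the command; raises CalledProcessError if the tool exits non-zero."""
    subprocess.run(build_tts_command(text, output_wav), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the text contains quotes or question marks.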

https://github.com/rhasspy/piper/assets/5284924/9e75eca4-b73d-46a4-b0c4-e88df3a2ae4b

csukuangfj avatar Dec 13 '23 04:12 csukuangfj

By the way, I just managed to build Android APKs for the pre-trained GLaDOS models mentioned in this issue, i.e., for the following two models:

  • A Spanish model from https://drive.google.com/file/d/12tNCCyd0Hf5jsyqCw8828kLSHHx5LOw9/view?usp=drive_link
  • An English model from https://github.com/dnhkng/GlaDOS

You can find the APKs at https://k2-fsa.github.io/sherpa/onnx/tts/apk.html

For your convenience, the download address is given below:

If you are interested in how we build the APK, please read the following documentation https://k2-fsa.github.io/sherpa/onnx/android/index.html


You can also try the models in the following huggingface space in your browser

http://huggingface.co/spaces/k2-fsa/text-to-speech

csukuangfj avatar Dec 13 '23 07:12 csukuangfj

could someone also train a german version please?

LaneaLucy avatar Nov 21 '24 13:11 LaneaLucy

@LaneaLucy I would, but I'm not sure how to do it! I trained using the voice from the game. Is there a German voice in the German edition?

(I live in Bavaria, which is almost German :) )

dnhkng avatar Nov 21 '24 15:11 dnhkng

@dnhkng Yes, there is a German voice in Portal 1 and Portal 2.

LaneaLucy avatar Nov 23 '24 13:11 LaneaLucy

@LaneaLucy I am doing new voice training soon. Join my Discord at https://discord.com/invite/ERTDKwpjNB; I'm discussing it in the general section now.

dnhkng avatar Dec 02 '24 09:12 dnhkng

@dnhkng Your GlaDos model is amazing! You should also create a Cave Johnson and Wheatley model. Can you imagine having multiple devices speaking to each other with the different voices? 🤣 https://theportalwiki.com/wiki/Cave_Johnson_voice_lines https://theportalwiki.com/wiki/Wheatley_voice_lines

HeedfulCrayon avatar Jan 08 '25 18:01 HeedfulCrayon

It's not clear to me what I have to add to which directories in the Piper data folder. I don't see any large file with extension ONNX in that repository. What gives?

Rudd-O avatar Jan 28 '25 23:01 Rudd-O

FYI the model files can now be found on the releases page of that project. Also, this guide was very helpful in adding a custom voice to Piper running in HA.

threesquared avatar Feb 04 '25 22:02 threesquared

This release page seems to be missing the .json file

maxi1134 avatar Feb 06 '25 15:02 maxi1134

@maxi1134 it's in models/TTS.

Only big model weights files are in releases.

dnhkng avatar Feb 06 '25 17:02 dnhkng

@maxi1134

Did you find a .json file to use with this?

kn4thx avatar Feb 10 '25 18:02 kn4thx

I just quickly searched the repo and found a json file in models/TTS/glados.json that looks similar to what piper expects: https://github.com/dnhkng/GLaDOS/blob/main/models/TTS/glados.json
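
One rough way to check whether such a file matches what Piper expects is to look for the fields a Piper voice config normally carries. The sketch below is not Piper's own validation. The key list is an assumption based on typical Piper voice configs and may be incomplete, and `missing_piper_keys` and `check_config_file` are hypothetical helpers.

```python
import json
from typing import Any, Dict, List

# Dotted paths assumed to be present in a Piper voice config (may be incomplete).
EXPECTED_KEYS = [
    "audio.sample_rate",
    "espeak.voice",
    "inference.noise_scale",
    "inference.length_scale",
    "inference.noise_w",
    "phoneme_id_map",
]


def missing_piper_keys(config: Dict[str, Any]) -> List[str]:
    """Return the dotted paths from EXPECTED_KEYS that `config` lacks."""
    missing = []
    for dotted in EXPECTED_KEYS:
        node: Any = config
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing


def check_config_file(path: str) -> List[str]:
    """Load a JSON config from disk and report which expected keys are absent."""
    with open(path, encoding="utf-8") as handle:
        return missing_piper_keys(json.load(handle))
```

An empty list from `check_config_file("glados.json")` only means the assumed fields exist; it does not prove that the phoneme map matches the model.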

systemofapwne avatar Feb 15 '25 23:02 systemofapwne

FYI: I trained the German GLaDOS voice over the past few days and it is quite promising. I plan to release it (on Hugging Face), including my toolchain and the "training data" (well, rather how to extract the voice lines from the game, plus my selection of "good" samples).

https://github.com/user-attachments/assets/e4b7f92b-cc7c-40d0-95b9-c21c35acaf43

Edit: Here is my trained model: https://huggingface.co/systemofapwne/piper-de-glados

systemofapwne avatar Feb 28 '25 22:02 systemofapwne

How do I use this in the container version of Piper Wyoming?

Rudd-O avatar Apr 08 '25 13:04 Rudd-O