
Add custom vocabulary to English model vosk-model-en-us-0.22 - STEP BY STEP GUIDE

Open apostolistzimas opened this issue 1 year ago • 7 comments

Is it possible to add custom words to vosk-model-en-us-0.22 using Linux and then run the model in Python on Windows 10? The model is pretty good, but it does not properly recognize personal names and company names that are not well known. Could anyone who knows how to do this post a step-by-step tutorial here? I am new to Linux and things seem too complicated for me. I have read the documentation, but most of it is pretty hard for a new Linux user. Thanks.

apostolistzimas avatar Jul 29 '22 12:07 apostolistzimas

We do not have a step-by-step guide. If you need help, you can try it yourself first and ask specific questions. You can also contribute a guide for us; we will fix the issues we see in it.

Before starting, you can also take a general Linux course; it will help you.

nshmyrev avatar Jul 29 '22 12:07 nshmyrev

OK, that is amazing. I am going to make a dictionary of words and their phonemes and try to add them to the vosk-model-en-us-0.22 model. I will also send you the files so you can use them to enrich your model (if you want). Also, if you want any help with your Greek model, I am willing to help language-wise, since it is my native language. Thank you.

apostolistzimas avatar Jul 29 '22 13:07 apostolistzimas

That sounds great!

nshmyrev avatar Jul 29 '22 13:07 nshmyrev

The update packages (see https://alphacephei.com/vosk/lm) are very convenient for extending and reducing the lexicon, but you will need some Linux knowledge ...

svenha avatar Aug 05 '22 13:08 svenha

First of all, thanks for all the responses. Question: in vosk-model-en-us-0.22 I can see a words.txt file inside the graph folder, which contains words, each followed by a unique number. However, I haven't yet found where the corresponding phonemes of these words are stored. Are they in a text file as well, which I could modify manually to add new words?

apostolistzimas avatar Aug 05 '22 14:08 apostolistzimas
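
Regarding the words.txt file mentioned above: editing it by hand will not add words, but it can at least be read to see which names are missing from the model's vocabulary. Below is a minimal sketch, assuming the file has one `<word> <id>` pair per line as described in the question, and assuming the model's entries are lowercase. The function names, the example path, and the sample names are hypothetical and not part of the Vosk API.

```python
from pathlib import Path


def load_vocabulary(words_path):
    """Read a words.txt symbol table with one '<word> <id>' pair per line."""
    vocab = set()
    for line in Path(words_path).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        # Skip blank or malformed lines instead of failing on them.
        if len(parts) == 2:
            vocab.add(parts[0])
    return vocab


def missing_words(candidates, vocab):
    """Return the candidates that are absent from the vocabulary.

    Candidates are lowercased before lookup (assumption: the model's
    word list is lowercase).
    """
    return [word for word in candidates if word.lower() not in vocab]


if __name__ == "__main__":
    # Hypothetical location of the unpacked model; adjust to your setup.
    words_file = Path("vosk-model-en-us-0.22/graph/words.txt")
    if words_file.exists():
        vocab = load_vocabulary(words_file)
        # Hypothetical names to check.
        print(missing_words(["Apostolis", "Alphacephei", "hello"], vocab))
    else:
        print(f"{words_file} not found")
```

The words this reports as missing are the ones that would need to be added through the update packages; the script only inspects the model and does not modify it.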

@apostolistzimas No, it is not that simple. This is the reason why update packages exist.

svenha avatar Aug 05 '22 14:08 svenha

@apostolistzimas Hey, did you succeed in adding new words to the Vosk model?

ram-acke avatar Mar 07 '24 06:03 ram-acke