whisper.cpp

What model is used for keyword spotting?

Open lucasjinreal opened this issue 1 year ago • 1 comments

What model is used for keyword spotting?

lucasjinreal, Dec 12 '22 06:12

I don't think Whisper has this feature; you have to roll your own. I created an "error dictionary": a vector of pairs, each holding 1) the keyword you want to look for, and 2) a vector of incorrect transcriptions of that keyword. Whisper has trouble with some words. I think this is the downside of using AI trained on a huge dataset of random voices, as opposed to a software package that you train on specific phrases with your own voice. One example is the word "routing". Whisper has all kinds of trouble with this word, so my dictionary entry looks like this:

pair<CString,vector<CString>> word;
word.first = "routing";
word.second = {"browning","roding","rowding","rolling","loading","roting","drowning",
  "broadings","woding","broughting","broughtings","brooding","brody","roating","wildings",
  "bowering","roadings","borrowing","rowing","welding","rowling","boding",
  "rotting","brought in","noting","floating","rowdy","ronin","bauding",
  "grouting","rodin","boring","roaring","broding","brought","rowering",
  "blowing","brawing","floating","running","growing","browning","boarding","bouting",
  "row","rodding"};
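For anyone who wants to try the same idea, here is a rough sketch of how the lookup side of such an error dictionary could work. This is not the code from the post above: it uses std::string instead of CString so it builds without MFC, and the names ErrorEntry, replace_whole_word and correct_transcript are made up for illustration. It assumes matching should be case-insensitive and limited to whole words.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// One entry: the intended keyword plus the mis-transcriptions seen for it.
// (Hypothetical type name; std::string stands in for CString.)
using ErrorEntry = std::pair<std::string, std::vector<std::string>>;

// Lower-case a copy of the text so matching ignores case.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// True if `c` can be part of a word (letters, digits, apostrophe).
bool is_word_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '\'';
}

// Replace every whole-word occurrence of `from` in `text` with `to`.
void replace_whole_word(std::string& text, const std::string& from,
                        const std::string& to) {
    if (from.empty()) return;
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        const std::size_t end = pos + from.size();
        const bool left_ok = (pos == 0) || !is_word_char(text[pos - 1]);
        const bool right_ok = (end == text.size()) || !is_word_char(text[end]);
        if (left_ok && right_ok) {
            text.replace(pos, from.size(), to);
            pos += to.size();  // continue after the inserted keyword
        } else {
            ++pos;             // match was inside a longer word; move on
        }
    }
}

// Map known mis-transcriptions back to their keywords (result is lower-case).
std::string correct_transcript(const std::string& transcript,
                               const std::vector<ErrorEntry>& dictionary) {
    std::string text = to_lower(transcript);
    for (const auto& entry : dictionary) {
        // Try longer variants first so "brought in" is handled before "brought".
        std::vector<std::string> variants = entry.second;
        std::stable_sort(variants.begin(), variants.end(),
                         [](const std::string& a, const std::string& b) {
                             return a.size() > b.size();
                         });
        for (const auto& wrong : variants) {
            replace_whole_word(text, to_lower(wrong), entry.first);
        }
    }
    return text;
}
```

You would run correct_transcript on each piece of text that comes back from Whisper and then search the result for your keywords. The whole-word check matters because short variants like "row" would otherwise match inside unrelated words.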

Of course there are more elegant ways to do this, using XML, a database, a binary dictionary file, etc., but I have a limited set of keywords, so this is how I handle it. For me, Whisper is a little like squashing a fly with a sledgehammer. I might be better off with something that you train with your own voice for specific commands. However, I also need numeric voice input, like "speed 30" and similar, and Whisper seems to be working well enough.
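As an illustration of the numeric input case, a command like "speed 30" could be split into a keyword and a number roughly as below. This is only a sketch under some assumptions: the NumericCommand struct and parse_numeric_command function are hypothetical names, the text is assumed to have had punctuation stripped already, and the number is assumed to arrive as digits rather than spelled out. It needs C++17 for std::optional.

```cpp
#include <optional>
#include <sstream>
#include <string>

// A recognized command: a keyword followed by one integer argument.
// (Hypothetical type, for illustration only.)
struct NumericCommand {
    std::string keyword;
    int value = 0;
};

// Parse text such as "speed 30". Returns std::nullopt if the text is not
// exactly one word followed by one integer.
std::optional<NumericCommand> parse_numeric_command(const std::string& text) {
    std::istringstream in(text);
    NumericCommand cmd;
    if (!(in >> cmd.keyword >> cmd.value)) {
        return std::nullopt;  // missing keyword, or second token is not a number
    }
    std::string extra;
    if (in >> extra) {
        return std::nullopt;  // trailing tokens, e.g. "speed 30 now"
    }
    return cmd;
}
```

Running the transcript through the error dictionary first and then through a parser like this keeps the keyword correction and the number handling separate.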

RndyP, Dec 29 '22 16:12