whisper.cpp
Specialized vocabulary
I am interested in using the streaming tool in a very specialized context (radiology dictation), which uses an esoteric and relatively restricted vocabulary (e.g. a high likelihood of words like "hyperpneumatization" or "temporooccipital", but almost never common words like "book" or "spoon"). I have seen references in the whisper documentation to an initial_prompt option that steers the model towards certain terms, but I am not sure whether it is feasible to pass in a relatively large corpus of high-likelihood, but generally uncommon, words.
In the standard configuration, accuracy on the radiology reports I've tested is fairly poor: the model prefers more common words over the correct uncommon ones. I wonder if anyone has thoughts about this problem.
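To make the initial_prompt question concrete, here is a minimal sketch of trimming a large term list down to a prompt of bounded size. As far as I understand, Whisper only keeps a limited number of prompt tokens, so a whole corpus will not fit. The helper name `build_prompt` and the character budget are my own assumptions: a real limit would be counted in tokens with the model's tokenizer, not in characters.

```python
def build_prompt(terms, max_chars=600):
    """Join terms into a comma-separated prompt that fits in max_chars.

    Hypothetical helper: max_chars is a crude stand-in for the model's
    prompt token limit, which this sketch does not measure.
    Terms are assumed to be ordered from most to least important.
    """
    chosen = []
    length = 0
    for term in terms:
        # Account for the ", " separator before every term but the first.
        extra = len(term) + (2 if chosen else 0)
        if length + extra > max_chars:
            break
        chosen.append(term)
        length += extra
    return ", ".join(chosen)


vocab = ["hyperpneumatization", "temporooccipital", "pneumocephalus"]
prompt = build_prompt(vocab, max_chars=40)
print(prompt)
```

The resulting string is what would be handed to initial_prompt in the Python package. I believe whisper.cpp's main example takes the same text through its prompt option, but I have not checked whether the streaming example does.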
Well, one interesting idea I've seen would be to add an "IPA language". That way you'd get an approximate representation of how the spoken words sound, and could then determine what those words actually are.
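The "determine what those words actually are" step could be sketched as a lookup against a domain word list. Note that this is not the IPA approach itself: the sketch below swaps in plain spelling similarity from Python's standard difflib, applied to the text transcript, as a simpler stand-in. The word list, the `snap_to_lexicon` name, and the 0.8 cutoff are all assumptions for illustration.

```python
import difflib

# Hypothetical domain word list; a real one would come from a radiology lexicon.
LEXICON = ["hyperpneumatization", "temporooccipital", "pneumocephalus"]


def snap_to_lexicon(word, lexicon=LEXICON, cutoff=0.8):
    """Return the closest lexicon entry by spelling, or the word unchanged."""
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# A transcript with one misrecognized term (missing "p").
transcript = "mild hyperneumatization noted"
corrected = [snap_to_lexicon(w) for w in transcript.split()]
print(" ".join(corrected))
```

A cutoff this high leaves ordinary words alone, but it also means a badly mangled term will not be repaired, which is where a phonetic representation might do better than spelling.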