Custom Whisper Model
I regularly write in a "small" language (Swedish), and until Whisper Large v3 I really didn't have any good way to do speech recognition. But as revolutionary as Whisper has been, it still leaves a lot to be desired for smaller languages. In the case of Swedish, our national library, Kungliga Biblioteket, has released a Whisper model trained specifically on Swedish, and it reduces the WER (Word Error Rate) by an average of 47% compared to whisper-large-v3.
Here is a link to get more information:
https://huggingface.co/KBLab/kb-whisper-large
Any chance this could be made available in Speech Note?
Hi. Thanks for letting me know about these models. I wish every national library would do something similar.
I don't speak Swedish, but I tested the models on a reference audio sample, and even the "Tiny" model is quite capable. Really impressive.
I added them as "KBLab" models in f649df08139b56287b90d620c8f9503de31766b2, for both the WhisperCpp and FasterWhisper engines.
If you don't want to wait for the next version of Speech Note, you can enable them manually by changing the models.json file. To do so, edit ~/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote/models.json and add the following entries (e.g. as the last models):
{
  "name": "Svenska (FasterWhisper KBLab Tiny)",
  "model_id": "sv_fasterwhisper_kblab_tiny",
  "engine": "stt_fasterwhisper",
  "lang_id": "sv",
  "checksum": "94b5299a",
  "checksum_quick": "51a9f986",
  "size": "80549811",
  "comp": "dir",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-tiny/resolve/fb77c9949fde44d50255f6462f70c6d67621af11/model.bin",
    "https://huggingface.co/KBLab/kb-whisper-tiny/resolve/fb77c9949fde44d50255f6462f70c6d67621af11/config.json",
    "https://huggingface.co/KBLab/kb-whisper-tiny/resolve/fb77c9949fde44d50255f6462f70c6d67621af11/tokenizer.json",
    "https://huggingface.co/KBLab/kb-whisper-tiny/resolve/fb77c9949fde44d50255f6462f70c6d67621af11/vocabulary.json",
    "https://huggingface.co/KBLab/kb-whisper-tiny/resolve/fb77c9949fde44d50255f6462f70c6d67621af11/preprocessor_config.json"
  ]
},
{
  "name": "Svenska (FasterWhisper KBLab Base)",
  "model_id": "sv_fasterwhisper_kblab_base",
  "engine": "stt_fasterwhisper",
  "lang_id": "sv",
  "checksum": "3569ac76",
  "checksum_quick": "d3683759",
  "size": "150229133",
  "comp": "dir",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-base/resolve/35e1b469c4241867835daf57254ade4bed1f1d4c/model.bin",
    "https://huggingface.co/KBLab/kb-whisper-base/resolve/35e1b469c4241867835daf57254ade4bed1f1d4c/config.json",
    "https://huggingface.co/KBLab/kb-whisper-base/resolve/35e1b469c4241867835daf57254ade4bed1f1d4c/tokenizer.json",
    "https://huggingface.co/KBLab/kb-whisper-base/resolve/35e1b469c4241867835daf57254ade4bed1f1d4c/vocabulary.json",
    "https://huggingface.co/KBLab/kb-whisper-base/resolve/35e1b469c4241867835daf57254ade4bed1f1d4c/preprocessor_config.json"
  ]
},
{
  "name": "Svenska (FasterWhisper KBLab Small)",
  "model_id": "sv_fasterwhisper_kblab_small",
  "engine": "stt_fasterwhisper",
  "lang_id": "sv",
  "checksum": "30e7cb70",
  "checksum_quick": "829002d0",
  "size": "488558571",
  "comp": "dir",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-small/resolve/f516f51f3cb3782e28d41a22ccd1cd7df17ee515/model.bin",
    "https://huggingface.co/KBLab/kb-whisper-small/resolve/f516f51f3cb3782e28d41a22ccd1cd7df17ee515/config.json",
    "https://huggingface.co/KBLab/kb-whisper-small/resolve/f516f51f3cb3782e28d41a22ccd1cd7df17ee515/tokenizer.json",
    "https://huggingface.co/KBLab/kb-whisper-small/resolve/f516f51f3cb3782e28d41a22ccd1cd7df17ee515/vocabulary.json",
    "https://huggingface.co/KBLab/kb-whisper-small/resolve/f516f51f3cb3782e28d41a22ccd1cd7df17ee515/preprocessor_config.json"
  ]
},
{
  "name": "Svenska (FasterWhisper KBLab Medium)",
  "model_id": "sv_fasterwhisper_kblab_medium",
  "engine": "stt_fasterwhisper",
  "lang_id": "sv",
  "checksum": "e9583557",
  "checksum_quick": "bd9bd507",
  "size": "1532917950",
  "comp": "dir",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-medium/resolve/1951aa1bb411016e15023815d039da9425f1ec5a/model.bin",
    "https://huggingface.co/KBLab/kb-whisper-medium/resolve/1951aa1bb411016e15023815d039da9425f1ec5a/config.json",
    "https://huggingface.co/KBLab/kb-whisper-medium/resolve/1951aa1bb411016e15023815d039da9425f1ec5a/tokenizer.json",
    "https://huggingface.co/KBLab/kb-whisper-medium/resolve/1951aa1bb411016e15023815d039da9425f1ec5a/vocabulary.json",
    "https://huggingface.co/KBLab/kb-whisper-medium/resolve/1951aa1bb411016e15023815d039da9425f1ec5a/preprocessor_config.json"
  ]
},
{
  "name": "Svenska (FasterWhisper KBLab Large)",
  "model_id": "sv_fasterwhisper_kblab_large",
  "engine": "stt_fasterwhisper",
  "lang_id": "sv",
  "checksum": "e3c88aeb",
  "checksum_quick": "faeebbe6",
  "size": "3092296021",
  "comp": "dir",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-large/resolve/33cee585905bd2f817274d5a88e65ce3e8fcedb0/model.bin",
    "https://huggingface.co/KBLab/kb-whisper-large/resolve/33cee585905bd2f817274d5a88e65ce3e8fcedb0/config.json",
    "https://huggingface.co/KBLab/kb-whisper-large/resolve/33cee585905bd2f817274d5a88e65ce3e8fcedb0/tokenizer.json",
    "https://huggingface.co/KBLab/kb-whisper-large/resolve/33cee585905bd2f817274d5a88e65ce3e8fcedb0/vocabulary.json",
    "https://huggingface.co/KBLab/kb-whisper-large/resolve/33cee585905bd2f817274d5a88e65ce3e8fcedb0/preprocessor_config.json"
  ]
},
{
  "name": "Svenska (WhisperCpp KBLab Tiny)",
  "model_id": "sv_whisper_kblab_tiny",
  "engine": "stt_whisper",
  "lang_id": "sv",
  "checksum": "4f3e9de4",
  "checksum_quick": "15b18cb2",
  "size": "29883930",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-tiny/resolve/fb77c9949fde44d50255f6462f70c6d67621af11/ggml-model-q5_0.bin"
  ]
},
{
  "name": "Svenska (WhisperCpp KBLab Base)",
  "model_id": "sv_whisper_kblab_base",
  "engine": "stt_whisper",
  "lang_id": "sv",
  "checksum": "daca9acb",
  "checksum_quick": "4e7bca22",
  "size": "55303642",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-base/resolve/35e1b469c4241867835daf57254ade4bed1f1d4c/ggml-model-q5_0.bin"
  ]
},
{
  "name": "Svenska (WhisperCpp KBLab Small)",
  "model_id": "sv_whisper_kblab_small",
  "engine": "stt_whisper",
  "lang_id": "sv",
  "checksum": "702785de",
  "checksum_quick": "4c18184c",
  "size": "175217872",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-small/resolve/f516f51f3cb3782e28d41a22ccd1cd7df17ee515/ggml-model-q5_0.bin"
  ]
},
{
  "name": "Svenska (WhisperCpp KBLab Medium)",
  "model_id": "sv_whisper_kblab_medium",
  "engine": "stt_whisper",
  "lang_id": "sv",
  "checksum": "616d248a",
  "checksum_quick": "6da49dbe",
  "size": "539220676",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-medium/resolve/1951aa1bb411016e15023815d039da9425f1ec5a/ggml-model-q5_0.bin"
  ]
},
{
  "name": "Svenska (WhisperCpp KBLab Large)",
  "model_id": "sv_whisper_kblab_large",
  "engine": "stt_whisper",
  "lang_id": "sv",
  "checksum": "34e3bd28",
  "checksum_quick": "e822f64d",
  "size": "1081148395",
  "urls": [
    "https://huggingface.co/KBLab/kb-whisper-large/resolve/33cee585905bd2f817274d5a88e65ce3e8fcedb0/ggml-model-q5_0.bin"
  ]
}
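Make sure the file is still valid JSON after the edit. In particular, there must be a comma between the entry that used to be last and the first new one, and no comma after the final entry.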
After restarting the app, you should be able to download the "KBLab" models.
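If you'd like to sanity-check the pasted entries before restarting, a minimal Python sketch along these lines may help. It is not part of Speech Note: the names `REQUIRED_KEYS`, `check_entries` and `check_fragment` are made up for illustration, and the set of required keys is only an assumption based on the fields that every entry above happens to share, not a documented schema.

```python
import json

# Assumption: these are the keys shared by every entry in the snippet above.
# Treating them as mandatory is a guess, not Speech Note's documented schema.
REQUIRED_KEYS = {
    "name", "model_id", "engine", "lang_id",
    "checksum", "checksum_quick", "size", "urls",
}


def check_entries(entries):
    """Return a list of human-readable problems found in a list of model entries."""
    problems = []
    seen_ids = set()
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry #{index}: not a JSON object")
            continue
        model_id = entry.get("model_id")
        label = model_id if model_id is not None else f"entry #{index}"
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append(f"{label}: missing keys {sorted(missing)}")
        if model_id is not None:
            if model_id in seen_ids:
                problems.append(f"{label}: duplicate model_id")
            seen_ids.add(model_id)
        # Sizes in the snippet are byte counts written as strings of digits.
        if "size" in entry and not str(entry["size"]).isdigit():
            problems.append(f"{label}: size is not a whole number")
        urls = entry.get("urls")
        if urls is not None:
            ok = isinstance(urls, list) and urls and all(
                isinstance(u, str) and u.startswith("https://") for u in urls
            )
            if not ok:
                problems.append(f"{label}: urls must be a non-empty list of https links")
    return problems


def check_fragment(text):
    """Check comma-separated entries as pasted above (no surrounding brackets)."""
    try:
        entries = json.loads("[" + text + "]")
    except json.JSONDecodeError as err:
        return [f"invalid JSON near line {err.lineno}: {err.msg}"]
    return check_entries(entries)


if __name__ == "__main__":
    import sys

    # Usage: python check_models.py fragment.txt
    with open(sys.argv[1], encoding="utf-8") as handle:
        for problem in check_fragment(handle.read()):
            print(problem)
```

Save the entries you are about to paste into a text file and pass its path to the script; no output means nothing suspicious was found. To check only that the whole edited models.json still parses, running `python3 -m json.tool` on the file is enough.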
Wonderful! Thank you. I've changed my models.json file and downloaded the large model. Looking forward to testing!
New version 4.8.0 is out and all KBLab Whisper models are included.