whisper.cpp
Passing parameters like --print-colors or -ml 1 to the web version
Hello, @ggerganov! Impressive work, thanks!
I am wondering if it is possible to add a --print-colors option to the web version. I tried to figure it out myself but had no luck.
Small update: I managed to change the default values of print_progress and print_special, and they seem to work fine. Which is great.
For some reason, max_len 1 is not working. So wondering what I am doing wrong 🤔
And print_colors is not working at all, since there is no handler for it in whisper.cpp, only in main.cpp. So I am wondering if it is possible to move that functionality inside whisper.cpp?
PR: https://github.com/ggerganov/whisper.cpp/pull/454
@ggerganov hi, if you can give me some hints on how to do it, I am more than happy to start working on it in my PR.
@dkryaklin
The color coding logic cannot be part of the whisper.cpp library. It has to stay in the user code.
The idea is for the user to choose whatever coloring they might want to do based on the token probabilities.
To achieve this in the WASM example, you have to extend the emscripten.cpp bridge to provide the text segments and token probabilities to the JS layer. When you have that, you can use some HTML/CSS to render the transcribed text with colors in the web-page.
It's not super trivial to implement, but you don't need to modify whisper.h and whisper.cpp for sure.
Only the emscripten.cpp and index-tmpl.html file can give you what you need.
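To make the "coloring stays in user code" idea concrete, here is a minimal sketch of mapping token probabilities to colors and emitting HTML spans. It is not the code from main.cpp or from the WASM example. `TokenView`, `color_index` and `render_html` are hypothetical names, the palette is whatever the user picks, and cubing the probability is just one possible way to bias the mapping toward the low-confidence colors. The same mapping could equally live in the JS layer once the bridge exposes the probabilities.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for what the bridge would hand to the rendering code:
// the token text plus its probability (whisper_token_data::p in whisper.cpp).
struct TokenView {
    std::string text;
    float p; // token probability, expected in [0, 1]
};

// Map a probability to a palette index: low p -> first color, high p -> last.
// Cubing p is an assumption of this sketch, not a requirement.
std::size_t color_index(float p, std::size_t n_colors) {
    assert(n_colors > 0);
    const float clamped = std::min(1.0f, std::max(0.0f, p));
    const auto idx = static_cast<std::size_t>(
        std::pow(clamped, 3.0f) * static_cast<float>(n_colors));
    return std::min(idx, n_colors - 1);
}

// Wrap every token in a <span> colored by its probability.
// Note: the token text is not HTML-escaped in this sketch.
std::string render_html(const std::vector<TokenView>& tokens,
                        const std::vector<std::string>& palette) {
    std::string html;
    for (const auto& t : tokens) {
        html += "<span style=\"color:" + palette[color_index(t.p, palette.size())] +
                "\">" + t.text + "</span>";
    }
    return html;
}
```

With a palette such as {"red", "orange", "green"}, a token with p = 0.8 lands on "orange" and a token with p = 1.0 on "green".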
@ggerganov And what about max_len=1, how can this be set?
I applied PR https://github.com/ggerganov/whisper.cpp/pull/454 but no luck. Thanks.
@dkryaklin Did you manage to get max_len=1 working for per-word level timestamps?
Related: https://github.com/ggerganov/whisper.cpp/issues/460
@broccolihighkicks not really, I am going to focus on the node addon version since it is more promising in terms of speed.
@broccolihighkicks @dkryaklin
I implemented a function here to achieve the desired result:
https://github.com/ggerganov/whisper.cpp/commit/965c0123934feb47ac309dc0df0077d191b836c9
The output that you saw before was just logging, not the actual result of whisper_full.
That was the reason the parameters didn't apply.
I hope this helps!
great work @spirobel - just came back here wanting to share basically the same! :D
small add-on for --print-colors (getting the average token probability of each segment):
for (int i = 0; i < n_segments; ++i) {
    ...
    const int n_tokens = whisper_full_n_tokens(g_contexts[index], i);
    for (int j = 0; j < n_tokens; ++j) {
        const auto token = whisper_full_get_token_data(g_contexts[index], i, j);
        prob += token.p;
        ++prob_n;
    }
    // Heads-up that there can be more than 1 token even with `-ml 1`
    printf("token.p: %f\n", prob/prob_n);
    ...
}
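For anyone who wants to try the averaging in isolation, here is a self-contained sketch of the same idea without the whisper.cpp dependency. `TokenProb` and `segment_confidence` are hypothetical names, with `TokenProb` standing in for the `p` field of the token data returned by `whisper_full_get_token_data`. Unlike the snippet above, it makes the accumulator explicit and returns 0 for a segment with no tokens instead of dividing by zero; that fallback value is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for whisper_token_data; only the probability is modelled.
struct TokenProb {
    float p; // token probability, expected in [0, 1]
};

// Mean token probability of one segment.
// Returns 0 for an empty segment (an arbitrary choice made for this sketch).
float segment_confidence(const std::vector<TokenProb>& tokens) {
    if (tokens.empty()) {
        return 0.0f;
    }
    float sum = 0.0f;
    for (const auto& t : tokens) {
        assert(t.p >= 0.0f && t.p <= 1.0f); // sanity check on the input
        sum += t.p;
    }
    return sum / static_cast<float>(tokens.size());
}
```

In the bridge, the vector would be filled per segment from the token data, and the returned value could then be passed to the JS layer together with the segment text.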