Georgi Gerganov comments

Results: 420 comments by Georgi Gerganov

Speedup option with variable rate between 1x and 2x

When I implemented the x2 speed-up option, I did some research on tempo speed-up algorithms, and it looks like the general solution is not trivial to implement because...

Add process-specific timings

@abitofevrything Can you verify that everything works on Windows? After that I will merge it.

C-style API/Python bindings not working on Windows

The `whisper_full_params` struct has to be mapped precisely. I can immediately see that the order of `n_threads` and `n_max_text_ctx` is wrong: ![image](https://user-images.githubusercontent.com/1991296/213513010-ea58f1fc-b52d-4984-ac90-fb7a7886cd55.png)
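To illustrate why field order matters when mapping a C struct from Python, here is a minimal `ctypes` sketch. The struct below is a hypothetical four-field stand-in, not the real `whisper_full_params` layout, and the values are made up. It shows how swapping two same-sized fields silently exchanges their values instead of raising an error.

```python
import ctypes

# Hypothetical C struct used only for illustration (NOT the real layout):
#   struct demo_params { int strategy; int n_threads; int n_max_text_ctx; int offset_ms; };


class DemoParamsCorrect(ctypes.Structure):
    # Field order matches the hypothetical C declaration above.
    _fields_ = [
        ("strategy", ctypes.c_int),
        ("n_threads", ctypes.c_int),
        ("n_max_text_ctx", ctypes.c_int),
        ("offset_ms", ctypes.c_int),
    ]


class DemoParamsSwapped(ctypes.Structure):
    # Same fields, but n_threads and n_max_text_ctx are in the wrong order.
    _fields_ = [
        ("strategy", ctypes.c_int),
        ("n_max_text_ctx", ctypes.c_int),
        ("n_threads", ctypes.c_int),
        ("offset_ms", ctypes.c_int),
    ]


def reinterpret(src, target_type):
    """Read the raw bytes of `src` through the layout described by `target_type`."""
    return target_type.from_buffer_copy(bytes(src))


# Bytes as the native side would lay them out.
native = DemoParamsCorrect(strategy=0, n_threads=4, n_max_text_ctx=16384, offset_ms=0)

good = reinterpret(native, DemoParamsCorrect)
bad = reinterpret(native, DemoParamsSwapped)

print(good.n_threads, good.n_max_text_ctx)  # 4 16384
print(bad.n_threads, bad.n_max_text_ctx)    # 16384 4
```

Both layouts have the same size, so nothing fails at the call boundary. The mismatch only shows up as wrong parameter values on the other side.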

CSV format export trims spaces

@alex-bacart The `--max-len 1` option means to output at most 1 token per text segment. The word " Ponzi" consists of 2 tokens, ` Pon` and `zi`, and therefore it is being...
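A toy sketch of the effect described above, assuming a one-token-per-segment split. Only the ` Pon` and `zi` tokens come from the comment. The other tokens and the `split_segments` helper are hypothetical, and this is not whisper.cpp's actual code. It shows why the leading space on a token carries the word boundary, and what is lost if an exporter trims it.

```python
def split_segments(tokens, max_tokens):
    """Group consecutive tokens into segments of at most `max_tokens` tokens."""
    return [
        "".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


# Hypothetical token sequence; a leading space marks the start of a new word.
tokens = [" A", " Pon", "zi", " scheme"]

segments = split_segments(tokens, max_tokens=1)
trimmed = [s.strip() for s in segments]

# With the spaces kept, concatenating the segments restores the text.
print(repr("".join(segments)))  # ' A Ponzi scheme'

# After trimming, nothing says whether "zi" continues "Pon" or starts a new word.
print(repr(" ".join(trimmed)))  # 'A Pon zi scheme'
```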

Prevent word splitting when using max-len option

@mightymatth Thanks for this contribution - I think this is very useful! Although it is OK to merge like this, I will likely change it to have a bool flag...

whisper.cpp 1.20 produces different inference than OpenAI whisper and with higher WER

Thanks for the data point! How do I calculate WER scores?
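For reference, one common definition of WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below is a minimal version under that assumption. It splits on whitespace only and does no text normalization (casing, punctuation), which real evaluations usually apply first. It is not necessarily how the scores in this thread were computed.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis = i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One word of a six-word reference is missing: WER = 1/6.
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.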

Bulk repetition

Hi, thanks for the detailed steps - this helps a lot. After debugging with [WHISPER_DEBUG](https://github.com/ggerganov/whisper.cpp/blob/b2083c5d02db9a1e6dbb3d58254fd65ebfff4b5d/whisper.cpp#L91) enabled I can see immediately that in this case, the entropy-based check for repetition didn't...
