whisper.net
Explanation of FluentAPI settings
Hello!
Is there any information on which "With~" method in the fluent API corresponds to which setting/flag in whisper.cpp?
I'm mostly interested in the -ml flag, which limits the output length per line.
It looks like WithMaxSegmentLength() should work the same way as -ml, but I think it does not.
Thanks!
Hello @drajvver, not every flag in the main example of whisper.cpp has a corresponding With~ fluent API method, but every whisper.cpp whisper_full_params field (https://github.com/ggerganov/whisper.cpp/blob/master/whisper.h#L332) has a corresponding fluent API method in Whisper.net.
Some of the arguments are implemented only on the client side (e.g. diarization), but I added an example of this as well: https://github.com/sandrohanea/whisper.net/tree/main/examples/Diarization
For -ml (--max-len), the main example changes several whisper_full_params fields: https://github.com/ggerganov/whisper.cpp/blob/master/examples/main/main.cpp#LL776C1-L779C1
The Whisper.net equivalent of that would be:
.WithTokenTimestamps()
.WithMaxSegmentLength(15)
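For context, here is a minimal sketch of how those two calls might sit in a complete program. It assumes the WhisperFactory.FromPath / ProcessAsync usage from the Whisper.net README. The model path ggml-base.bin and the input file audio.wav are placeholders, not values from this thread.

```csharp
using System;
using System.IO;
using Whisper.net;

// Placeholder model path (assumption): point this at any ggml model file.
using var whisperFactory = WhisperFactory.FromPath("ggml-base.bin");

using var processor = whisperFactory.CreateBuilder()
    .WithLanguage("en")
    // Per the reply above, these two calls together are the
    // Whisper.net counterpart of passing -ml 15 to whisper.cpp.
    .WithTokenTimestamps()
    .WithMaxSegmentLength(15)
    .Build();

// Placeholder input file (assumption): a 16 kHz WAV file.
using var fileStream = File.OpenRead("audio.wav");

// Print each returned segment on its own line so the effect of the
// length limit is easy to compare against whisper.cpp's output.
await foreach (var segment in processor.ProcessAsync(fileStream))
{
    Console.WriteLine($"[{segment.Start} --> {segment.End}] {segment.Text}");
}
```

If the limit is applied, each printed segment should be short. Long, sentence-sized segments would suggest the setting is not taking effect.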
So I think that either it does not work as it should, or I'm making some sort of silly mistake. For this video: https://www.youtube.com/shorts/g9IYllmOtUc
And settings:
await using var processor = whisperFactory.CreateBuilder()
    .WithLanguage("en")
    .WithTemperature(0.2f)
    .WithTokenTimestamps()
    .WithMaxSegmentLength(4)
    .WithPrintProgress()
    .WithPrintResults()
    .WithPrintTimestamps()
    .Build();
I get output like this:
[00:00:00.000 --> 00:00:06.140] My friend Julius just moved into his new home and needed to go grab some tools, so he asked me to watch his place.
[00:00:06.140 --> 00:00:13.060] I watched his kitchen and found what I thought was his only ramen stash, but I looked to the left and saw another bag of ramen packages.
[00:00:13.060 --> 00:00:19.700] I started digging through it to see if any of them sounded good. Then I looked to the right and found even more instant ramen in a box.
[00:00:19.700 --> 00:00:28.060] Then I felt the urge to turn around and boom, there's another bag of noodles. I grabbed the super spicy ones and started to quickly make them before Julius got back.
[00:00:28.060 --> 00:00:35.900] I felt like I had spent too much time perusing his ramen stash, so I didn't add much to this. Now was it super spicy as advertised? Eh.
[00:00:35.900 --> 00:00:40.980] It definitely had a pleasant kick and the noodles were nice and chewy, but spice was probably a 3 out of 10.
It's completely possible that I'm doing something very wrong, but I can't see what that would be.
It does sound like a bug, but I haven't had time to check it yet :(
Hello again @drajvver,
I tried to reproduce the bug but couldn't: with the tiny model, Whisper.net returned the same output as whisper.cpp.
Can you please create a repro zip (including the model) using Whisper.net 1.4.4?
Closing due to inactivity. Feel free to reopen if the issue still happens with Whisper.net > 1.7.0