Georgi Gerganov

420 comments by Georgi Gerganov

Very basic tests that change certain formats and F32 -> F16 casts at hot spots indicate that this might not be a viable approach for improving performance. Will...

Someone figure out why the Windows build is failing and merge

> @bengarney New lines do add tokens
>
> ~@j-f1 Spaces do not add tokens~
>
> The one limitation I see here is that you cannot intentionally add trailing...

Should we merge now or wait for someone to test on Windows? @SuajCarrot maybe keep the `.sh` for now and add a comment that it is deprecated. We will remove...

It was confirmed in #285 that it works on Windows, so no need to do it

The `Floating point exception (core dumped)` is strange. Try getting the latest `master`, then `make clean` + `make stream` and try again. The larger models are quite heavy for real-time...

There was a bug in the `stream` example: a6dbd9188b13378dc36e2c669b9a22e17b4201d1. I think this fixes both the garbage results and the floating point exception.

@meakbiyik Thanks for reporting this. I think I see what the issue is: here we incorrectly override the `no_context` parameter, so the `--keep_context` argument does nothing because of this:...

This is very likely related to the new temperature fallback strategy that is enabled by default. For real-time streaming, it is recommended to disable it like this: https://github.com/ggerganov/whisper.cpp/blob/c9aeb3367632d4ba824db49245c884ba28d200af/examples/stream/stream.cpp#L617-L620

The problem with the fallback is that, when it triggers, it increases the decoding time significantly. I think this is not desirable for real-time purposes.
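To illustrate why a triggered fallback costs so much time, here is a minimal sketch of a temperature fallback loop. It is not the whisper.cpp implementation: `DecodeResult`, `decode_with_fallback` and the `decode_once` callback are hypothetical names, and the 0.2 step used in the note below is an assumed value. The parameter name `temperature_inc` is chosen to echo the field I believe `whisper_full_params` exposes, with a non-positive value treated here as "fallback disabled".

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical result of a single decoding pass (not a whisper.cpp type).
struct DecodeResult {
    bool ok; // true if the pass met the quality thresholds
};

// Illustrative fallback loop: decode at temperature 0, then retry at
// increasing temperatures (up to 1.0) until a pass succeeds.
// A non-positive temperature_inc disables the fallback, so only one pass runs.
// Returns the number of decoding passes that were executed.
int decode_with_fallback(float temperature_inc,
                         const std::function<DecodeResult(float)> & decode_once) {
    assert(decode_once && "decode_once must be callable");

    std::vector<float> temperatures = {0.0f};
    if (temperature_inc > 0.0f) {
        // Use an integer counter to avoid accumulating float rounding error.
        for (int i = 1; i * temperature_inc <= 1.0f + 1e-6f; ++i) {
            temperatures.push_back(i * temperature_inc);
        }
    }

    int passes = 0;
    for (float t : temperatures) {
        ++passes; // each pass is a full decode of the audio chunk
        if (decode_once(t).ok) {
            break;
        }
    }
    return passes;
}
```

In this sketch, a chunk that keeps failing the quality check with a step of 0.2 is decoded six times instead of once, which is the kind of latency spike a real-time stream cannot absorb. Passing a non-positive step keeps it to a single pass.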