Georgi Gerganov

420 comments by Georgi Gerganov

@thomasantony We want to have a C-style API in `llama.h`. We cannot expose C++ constructs. For now, leave it like this and let me apply the necessary changes on top...

Yes, it is of interest. The tree-based decoding is already fully supported. The speculative streams and multi-stream attention layers should be possible to support, but I would need an actual...

Given these results, I believe the fine-tuned model does not output timestamp tokens for some reason. To confirm that, can you provide the output of the same run after adding...

I see the `transcribe (50359)` token is being decoded a lot of times for some reason. This is not supposed to happen. I just pushed a change to `master` to...

We still see the `50359` token - this is unexpected. I guess the best option is to provide instructions for downloading the model so I can test it locally.

On a similar topic, recently I found this project: https://github.com/xenova/transformers.js It has a very efficient inference of Whisper tiny using WASM. They seem to be using something called ONNX Runtime....

> Also: maybe it's a good idea to make it so that `-nt` in [main.cpp](https://github.com/ggerganov/whisper.cpp/blob/master/examples/main/main.cpp?rgh-link-date=2024-01-07T19%3A07%3A23Z#L161) not only does not print timestamps, _but also does not compute them_: > > `wparams.no_timestamps...

Hi @patrickvonplaten - congrats on the release! I believe I have successfully added initial support for the distilled models in the following PR: https://github.com/ggerganov/whisper.cpp/pull/1424 However, I'm worried that for optimal...

Thanks for the links. Will probably look into chunking after I make the `v1.5.0` release of `whisper.cpp`.