Thomas Germer
> Is the lecture online? It is not, unfortunately. But you could ask if you can have the slides: https://dbs.cs.hhu.de/mitarbeiter.php?id=ludhim
I just noticed that I had an old and a new version of the slides. The new version included full citations.
What is your issue?
I think there was some issue in the literature where `m` was used differently depending on the author. I'm not sure, though, whether this was the exact issue. The derivation went something like...
> right now the workaround is to use the new `/apply-template` endpoint in llama-server, added in a recent commit. It's explained here: https://github.com/ggerganov/llama.cpp/tree/master/examples/server#post-apply-template-apply-chat-template-to-a-conversation

Great! With this new `/apply-template` endpoint, we...
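To make the endpoint usage concrete, here is a minimal sketch of calling `/apply-template`, assuming llama-server is running locally on port 8080 and accepts a chat-completions-style `messages` array (server address, port, and exact response fields are assumptions; adjust for your setup):

```python
import json
import urllib.request

# Assumed local llama-server address; adjust host/port as needed.
URL = "http://localhost:8080/apply-template"

def build_payload(messages):
    """Serialize a conversation into a JSON request body."""
    return json.dumps({"messages": messages}).encode("utf-8")

def apply_template(messages):
    """POST the conversation; the server applies its chat template."""
    req = urllib.request.Request(
        URL,
        data=build_payload(messages),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

messages = [{"role": "user", "content": "Hello!"}]
# apply_template(messages)  # requires a running llama-server instance
```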
> The feature already exists in the form of custom [GBNF grammars](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md)!

Great! It works!

```python
import requests

url = "http://localhost:8080/v1/chat/completions"

def prefix_using_grammar():
    prefix = "```go\nfunc quacksort"
    data = {...
```
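The idea behind the grammar workaround is that a GBNF root rule starting with a string literal forces the generated output to begin with that literal. A small helper for building such a grammar might look like this (the trailing "any character" class and the `grammar` request field are illustrative assumptions, not verified against the server):

```python
import json

def grammar_for_prefix(prefix):
    """Build a GBNF grammar whose root must start with the given prefix.

    Escape characters that are special inside GBNF string literals.
    """
    escaped = (
        prefix.replace("\\", "\\\\")
              .replace('"', '\\"')
              .replace("\n", "\\n")
    )
    # After the forced prefix, allow any further characters.
    # The negated character class here is an illustrative guess at
    # "match anything"; consult the GBNF README for the exact syntax.
    return f'root ::= "{escaped}" [^\\x00]*'

def build_request(prefix):
    """Assemble an assumed chat-completions body with a grammar field."""
    return {
        "messages": [{"role": "user", "content": "Implement quacksort in Go."}],
        "grammar": grammar_for_prefix(prefix),
    }

req = build_request("```go\nfunc quacksort")
print(json.dumps(req, indent=2))
```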
@ggerganov Could you please reopen this issue? [The grammar-workaround](https://github.com/ggml-org/llama.cpp/issues/11536#issuecomment-2643444612) works, but a more efficient solution is possible.
> this is solved by [#13174](https://github.com/ggml-org/llama.cpp/pull/13174)

~~Do you have an example of how to use this? I can only see an example for `/apply-template`.~~

EDIT: It seems like assistant answers are...
Above the equation, it says: "Note that $\beta_\text{max}$ is defined as a small value". In addition, $\beta_t < \beta_\text{max}$ ("∵" means "because"). Therefore, $\beta_t$ is also a negligibly small value....
This issue is caused by Numba and has to be fixed there eventually:

* https://github.com/numba/numba/issues/5520
* https://github.com/numba/numba/issues/5275

Until that happens, it might be possible to hide the warning by setting...