Andrei
@Ph0rk0z give it a shot again with v0.2.36; the cmake build was fixed in https://github.com/ggerganov/llama.cpp/pull/5182
@Smartappli wow thank you so much!
@userbox020 are you using cmake to build llama.cpp?
Check out #1147, it should be merged soon. The only caveat here is that you'll need to use the llava example in llama.cpp to extract the image encoder as well when...
@CISC do you mind posting a gguf that uses this right now? Yeah, I think we can do something even simpler and not introduce any new parameters, just use the...
@CISC good point, let's prefix these dynamically loaded chat templates with `chat_template`, so `chat_template.rag` or `chat_template.tool_use` for the Cohere model.
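To make the naming convention above concrete, here is a minimal sketch (not the actual llama-cpp-python implementation) of how named chat templates could be resolved from GGUF-style metadata, assuming the dynamically loaded templates live under keys prefixed with `tokenizer.chat_template.` while the default template keeps the bare key; the key names and helper below are illustrative:

```python
def resolve_chat_template(metadata, name=None):
    """Return the template string for `name`, or the default template.

    Assumes named templates are stored under "tokenizer.chat_template.<name>"
    and the default under the bare "tokenizer.chat_template" key.
    """
    if name is None:
        return metadata.get("tokenizer.chat_template")
    return metadata.get(f"tokenizer.chat_template.{name}")


# Example metadata as it might appear for a model shipping extra templates:
meta = {
    "tokenizer.chat_template": "<default jinja template>",
    "tokenizer.chat_template.rag": "<rag jinja template>",
    "tokenizer.chat_template.tool_use": "<tool_use jinja template>",
}

print(resolve_chat_template(meta, "rag"))  # <rag jinja template>
print(resolve_chat_template(meta))         # <default jinja template>
```

An unknown name simply falls through to `None`, so callers can fall back to the default template if a model doesn't ship the requested variant.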
Hey, sorry for the late reply here, the model should work as of version 0.2.57, pushed a couple of days ago. One issue though is that the gguf files...
https://github.com/abetlen/llama-cpp-python/issues/1342#issuecomment-2054099460 I'll paste my comment here, and maybe we can open a new discussion. Basically, I'm concerned about the size of releases ballooning with the number of prebuilt wheel variants....
Hey @Smartappli let me consider this, technically you should be able to do this by just launching the app directly from the CLI via hypercorn, correct?
Hey @Smartappli will review soon.