Andrei
@Ph0rk0z give it a shot again with v0.2.36; the cmake build was fixed in https://github.com/ggerganov/llama.cpp/pull/5182
@Smartappli wow thank you so much!
@userbox020 are you using cmake to build llama.cpp?
Check out #1147, it should be merged soon. The only caveat here is that you'll need to use the llava example in llama.cpp to extract the image encoder as well when...
@CISC do you mind posting a gguf that uses this right now? Yeah, I think we can do something even simpler and not introduce any new parameters, just use the...
@CISC good point, let's prefix these dynamically loaded chat templates with `chat_template`, so `chat_template.rag` or `chat_template.tool_use` for the Cohere model.
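To make the naming convention above concrete, here is a minimal sketch (not the actual llama-cpp-python implementation) of how named chat templates could be resolved from GGUF-style metadata, assuming the dynamically loaded templates live under keys prefixed with `tokenizer.chat_template.` while the default template keeps the bare key; the key names and helper below are illustrative:

```python
def resolve_chat_template(metadata, name=None):
    """Return the template string for `name`, or the default template.

    Assumes named templates are stored under "tokenizer.chat_template.<name>"
    and the default under the bare "tokenizer.chat_template" key.
    """
    if name is None:
        return metadata.get("tokenizer.chat_template")
    return metadata.get(f"tokenizer.chat_template.{name}")


# Example metadata as it might appear for a model shipping extra templates:
meta = {
    "tokenizer.chat_template": "<default jinja template>",
    "tokenizer.chat_template.rag": "<rag jinja template>",
    "tokenizer.chat_template.tool_use": "<tool_use jinja template>",
}

print(resolve_chat_template(meta, "rag"))  # <rag jinja template>
print(resolve_chat_template(meta))         # <default jinja template>
```

An unknown name simply falls through to `None`, so callers can fall back to the default template if a model doesn't ship the requested variant.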
Hey, sorry for the late reply here, the model should work as of version 0.2.57, pushed a couple of days ago. One issue though is that the gguf files...
https://github.com/abetlen/llama-cpp-python/issues/1342#issuecomment-2054099460 I'll paste my comment here, and maybe we can open a new discussion. Basically, I'm concerned about the size of releases ballooning with the number of prebuilt wheel variants....
Hey @Smartappli let me consider this, technically you should be able to do this by just launching the app directly from the CLI via hypercorn, correct?
Hey @Smartappli will review soon.