llama-cpp-python
Python bindings for llama.cpp
- [x] #489
- [x] #490
- [x] #491
- [x] #488
- [x] #492
- [x] #494
- [x] #493
- [ ] #771
- [x] #675
This PR updates llama_cpp.py so that it matches the [llama.h](https://github.com/ggml-org/llama.cpp/blob/master/include/llama.h) API changes introduced in the following commits:

- https://github.com/ggml-org/llama.cpp/commit/e0dbec0bc6cd4b6230cda7a6ed1e9dac08d1600b
- https://github.com/ggml-org/llama.cpp/commit/8fcb563613e20a04dd9791f0a9b8a41086428c09
- https://github.com/ggml-org/llama.cpp/commit/00d53800e00bb22a26bf710fa6bd1150e412cc1d
- https://github.com/ggml-org/llama.cpp/commit/dd373dd3bf81eced3e711fb7cb49123a6105933e
- https://github.com/ggml-org/llama.cpp/commit/b3de7cac732e7dc0e10e1bb07502a500b0ee9022
- https://github.com/ggml-org/llama.cpp/commit/2c3f8b850a4a6cff0f5dda2135c03fc81d33ed8b
- https://github.com/ggml-org/llama.cpp/commit/e0e912f49b3195ef9d0c51378629ba03c9b972da

I couldn't find any example on how...
Hi! 👋 I was testing the Server API with Moondream 2 as the model, and the [documented `chat_format`](https://github.com/abetlen/llama-cpp-python/blob/37eb5f0a4c2a8706b89ead1406b1577c4602cdec/README.md#L504), `moondream2`, threw an error: ```plain Error code: 500 - {'error': {'message': "Invalid...
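The error above comes from the server's OpenAI-compatible `/v1/chat/completions` endpoint. As a point of reference, a minimal sketch of the multimodal chat payload that kind of request sends (the image URL, prompt text, and the `build_chat_request` helper are illustrative, not taken from the report):

```python
import json

def build_chat_request(model, image_url, prompt):
    """Build an OpenAI-style multimodal chat payload (illustrative helper)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Image part first, then the text prompt, per the
                    # OpenAI multimodal message format.
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }

payload = build_chat_request(
    "moondream2",
    "https://example.com/cat.png",  # placeholder URL
    "Describe this image.",
)
body = json.dumps(payload)  # what the client POSTs to /v1/chat/completions
```

With a server started with `--chat_format moondream2`, the chat handler is what turns a payload like this into a prompt, which is where the 500 above is raised.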
I updated my version because some DeepSeek models were failing to load; after updating, they load, but only on the CPU. I tried other, older models on my system...
In the method `load_shared_library`, the `winmode` specification is incorrect. The current code is:

```python
cdll_args["winmode"] = ctypes.RTLD_GLOBAL
```

However, `ctypes.RTLD_GLOBAL` is not a valid value for the `winmode` argument; it is only a value for the `mode` argument....
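A sketch of how the loader could pass `RTLD_GLOBAL` correctly (the function name mirrors the one in the report, but the body is illustrative, not the library's actual code):

```python
import ctypes
import sys

def load_shared_library(lib_path):
    """Load a shared library, passing RTLD_GLOBAL where it belongs.

    On POSIX, RTLD_GLOBAL is a flag for the `mode` argument of
    ctypes.CDLL. The separate `winmode` argument controls Windows
    LoadLibraryEx flags, and RTLD_GLOBAL is not meaningful there
    (on Windows, ctypes defines RTLD_GLOBAL as 0, so passing it as
    `winmode` silently selects the legacy DLL search behaviour).
    """
    cdll_args = {}
    if sys.platform != "win32":
        # RTLD_GLOBAL belongs in `mode`, not `winmode`.
        cdll_args["mode"] = ctypes.RTLD_GLOBAL
    # On Windows, leave `winmode` unset so ctypes uses its default
    # (secure) DLL search path.
    return ctypes.CDLL(str(lib_path), **cdll_args)
```

Leaving `winmode` unset at all is usually the safest choice, since Python's default enables the hardened DLL search path introduced in Python 3.8.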
# Prerequisites Please answer the following questions for yourself before submitting an issue. - [ ] I am running the latest code. Development is very rapid, so there are no...
I was curious whether it is possible to swap out Microsoft's code-completion engine for an open-source one. After following the [docs](https://llama-cpp-python.readthedocs.io/en/latest/server/#guides), it appears that Copilot has changed...
I'm using CentOS 7 (glibc 2.17) with both CUDA 11.8 and 12.4. Up until version 0.3.7, I was able to install llama-cpp-python with either CUDA version. However, starting from 0.3.8,...
Hey everyone, and especially @abetlen , Hope you’re doing well! I’m reaching out to see if you could help me add support for `Gemma3 multimodal`. If you could walk me...
Support for the Qwen2-VL and MiniCPM-o models would be nice. They have already been merged into the llava subproject of llama.cpp.