llama-cpp-python
Python bindings for llama.cpp
Hello, thank you for all the hard work on this project. I wanted to check whether this project is deprecated or will continue to be supported. It's been...
# Prerequisites Please answer the following questions for yourself before submitting an issue. - [X] I am running the latest code. Development is very rapid so there are no tagged...
In `llama.cpp`, the `--n-predict` option sets the number of tokens to predict when generating text. I can't find the binding for it in the [docs](https://llama-cpp-python.readthedocs.io/en/latest/).
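For context, the closest counterpart in the Python bindings appears to be the `max_tokens` argument accepted when calling a `Llama` instance. The sketch below is a minimal illustration rather than the project's documented answer: the helper name `generate` and the model path in the comments are hypothetical, and it assumes the object passed in behaves like `llama_cpp.Llama` (callable with `max_tokens`, returning an OpenAI-style completion dict).

```python
from typing import Any, Callable, Dict


def generate(llm: Callable[..., Dict[str, Any]], prompt: str, n_predict: int = 128) -> str:
    """Hypothetical helper: forward an n_predict-style limit as max_tokens."""
    # Assumption: `llm` behaves like llama_cpp.Llama, whose __call__ accepts
    # `max_tokens` and returns a dict with the text under choices[0]["text"].
    result = llm(prompt, max_tokens=n_predict)
    return result["choices"][0]["text"]


# Usage with a real model (the path is a placeholder, not a real file):
#   from llama_cpp import Llama
#   llm = Llama(model_path="path/to/model.gguf")
#   print(generate(llm, "Q: Name the planets. A:", n_predict=32))
```

Taking the model as a parameter keeps the helper independent of any particular model file, so it can be exercised with a stub before a real model is loaded.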
Hi team, thanks for the project. Could you please help upgrade it to the latest llama.cpp? Ref: https://github.com/ggml-org/llama.cpp/issues/12091
# Prerequisites Please answer the following questions for yourself before submitting an issue. - [x] I am running the latest code. Development is very rapid so there are no tagged...
### Testing Done - [x] Build `musa_simple` docker image locally -> pass - [x] Run `musa_simple` container to serve `llama3.2_1b_q8_0.gguf` -> pass ```bash ❯ docker run --net=host --cap-add SYS_RESOURCE -e...
OS: Windows 10 x64, Python 3.11 x64, pandas 2.2.3, llama-cpp-python 0.3.6

```python
import pandas as pd
from llama_cpp import Llama

Llama(model_path='models/user-bge-m3-q4_k_m.gguf')
```

OSError: exception: access violation reading 0x0000000000000000

Solution:...
Remove a redundant type hint, probably introduced by a typo. The two type hints are identical, so the duplicate serves no purpose.
This PR upgrades the `chatml-function-calling` chat handler with support for streaming tool use and fixes #1883, #1869, and #1756, among other improvements. Changes: 1. General: a. ✨ If no system...
Is your feature request related to a problem? Please describe. I'm trying to install llama-cpp-python on Python 3.13.2, but I am facing multiple compilation errors, specifically related to: std::chrono::system_clock...