llama-cpp-python

Python bindings for llama.cpp

Results: 424 llama-cpp-python issues, sorted by most recently updated

Hello, Thank you for all the hard work on this project. I wanted to check if this project is deprecated or if it will continue to be supported. It's been...

# Prerequisites Please answer the following questions for yourself before submitting an issue. - [X] I am running the latest code. Development is very rapid so there are no tagged...

In `llama.cpp`, the `--n-predict` option sets the number of tokens to predict when generating text. I can't find the corresponding binding in the [docs](https://llama-cpp-python.readthedocs.io/en/latest/).
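For context, the closest equivalent in the Python API appears to be the `max_tokens` parameter of `Llama.__call__` / `create_completion`; a minimal sketch (the model path is illustrative, not from the issue):

```python
from llama_cpp import Llama

# Illustrative model path; point this at any local GGUF file.
llm = Llama(model_path="models/llama-3.2-1b-q8_0.gguf")

# max_tokens plays the role of llama.cpp's --n-predict:
# it caps the number of tokens generated for this completion.
out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])
```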

Hi team, thanks for the project. Can you please help with upgrading to the latest llama.cpp version? Ref: https://github.com/ggml-org/llama.cpp/issues/12091

# Prerequisites Please answer the following questions for yourself before submitting an issue. - [x] I am running the latest code. Development is very rapid so there are no tagged...

### Testing Done

- [x] Build `musa_simple` docker image locally -> pass
- [x] Run `musa_simple` container to serve `llama3.2_1b_q8_0.gguf` -> pass

```bash
❯ docker run --net=host --cap-add SYS_RESOURCE -e...
```
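For anyone reproducing the test, once the container is serving the model the OpenAI-compatible endpoint can be queried from Python. A minimal sketch, assuming the server is on its default port 8000 and reachable at localhost thanks to `--net=host` (the model name is an assumption as well):

```python
import requests

# Assumes llama_cpp.server inside the container listens on its default port 8000.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "llama3.2_1b_q8_0.gguf",  # model name as served; assumption
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 32,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```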

OS: Windows 10 x64, Python 3.11 x64, pandas 2.2.3, llama-cpp-python 0.3.6

```python
import pandas as pd
from llama_cpp import Llama

Llama(model_path='models/user-bge-m3-q4_k_m.gguf')
```

`OSError: exception: access violation reading 0x0000000000000000`

![Image](https://github.com/user-attachments/assets/ee466188-574b-4253-925b-833abea8b732)

Solution:...
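As a general aside (not necessarily the truncated solution above): BGE-M3 is an embedding model, and the usual way to load one in llama-cpp-python is with `embedding=True`. A minimal sketch, assuming the same GGUF file:

```python
from llama_cpp import Llama

# Sketch only: BGE-M3 is an embedding model, so it is normally loaded
# with embedding=True and queried via create_embedding() rather than
# used for text generation.
llm = Llama(model_path="models/user-bge-m3-q4_k_m.gguf", embedding=True)

emb = llm.create_embedding("hello world")
print(emb["data"][0]["embedding"][:4])  # inspect the returned embedding
```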

Remove a redundant type hint, which was likely introduced by a typo or similar mistake. The two type hints are identical, so keeping both makes no sense.

This PR upgrades the `chatml-function-calling` chat handler with support for streaming tool use and fixes #1883, #1869, and #1756, among other improvements. Changes: 1. General: a. ✨ If no system...
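For readers following along, a hedged sketch of how the handler is typically invoked; the model path and tool schema are illustrative, and streaming tool-call deltas depend on this PR's changes being in place:

```python
from llama_cpp import Llama

# Illustrative model path; any chat-capable GGUF model will do.
llm = Llama(
    model_path="models/llama-3.2-1b-q8_0.gguf",
    chat_format="chatml-function-calling",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# With stream=True the handler yields chat-completion chunks;
# tool-call deltas show up under choices[0]["delta"].
for chunk in llm.create_chat_completion(
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
    stream=True,
):
    delta = chunk["choices"][0]["delta"]
    if "tool_calls" in delta:
        print(delta["tool_calls"])
```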

Body: Is your feature request related to a problem? Please describe. I'm trying to install llama-cpp-python on Python 3.13.2, but I am facing multiple compilation errors, specifically related to: std::chrono::system_clock...