llama-cpp-python

Python bindings for llama.cpp

Results 424 llama-cpp-python issues

Added a gemma3 chat handler and fixed the image embedding; supports multiple images. Included llama.cpp functions and structures:
- clip_image_load_from_bytes
- clip_image_batch_encode
- clip_image_preprocess
- clip_image_f32_batch_init
- clip_image_f32_batch_free
- clip_image_u8_init
- ...

On my Mac, running the following breaks:
```bash
pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.5-metal/llama_cpp_python-0.3.5-cp312-cp312-macosx_11_0_arm64.whl
```
```bash
deflate decompression error: invalid stored block lengths
```
I think there is something wrong with the wheel,...

Seems to work, but could someone who knows this project better please check the order in `apply_func()`?

I am trying to build llama-cpp-python with CUDA and it is failing. I have tried to use some of the suggestions in here for similar issues and they aren't working...

# Prerequisites
- [x] I am running the latest code.
- [x] I followed the README.md.
- [x] I searched open/closed issues.
- [x] This is a reproducible error and...

# Environment
Latest version as of 2025.03.26
# Issue
I tried to use this library to record the logits of my gguf model, but there seemed to be some errors on the...

Safeerchalil:codespace-automatic-barnacle-5gv4qx4j775wf47g _Originally posted by @Safeerchalil in https://github.com/abetlen/llama-cpp-python/issues/1978#issuecomment-2797951584_

pip install llama-cpp-python

Hi, I'm currently facing this `tokenizer_name NotImplementedError` while testing a quantized `.gguf` model with [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). I'm having this trouble with `--apply_chat_template`. Run command: lm_eval --model gguf --model_args base_url=http://127.0.1.1:8080 --tasks gsm8k --output_path result/gsm8k...

# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged...