llama-cpp-python issues

feat: Add Gemma3 chat handler (#1976)

25

Added gemma3 chat handler, and fixed the image embedding, supports multiple images. Included llamacpp functions and structures: - clip_image_load_from_bytes - clip_image_batch_encode - clip_image_preprocess - clip_image_f32_batch_init - clip_image_f32_batch_free - clip_image_u8_init -...

kossum

Macos Metal Github Release for python 3.12 is broken

1

On my mac, running the following breaks: ```bash pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.5-metal/llama_cpp_python-0.3.5-cp312-cp312-macosx_11_0_arm64.whl ``` ```bash deflate decompression error: invalid stored block lengths ``` I think there is something wrong with the wheel,...

haixuanTao

Add support for XTC and DRY samplers

11

Seems to work, but someone who knows this project better please check the order in `apply_func()`.

zpin

CUDA llama-cpp-python build failed.

3

I am trying to build llama-cpp-python with CUDA and it is failing. I have tried to use some of the suggestions in here for similar issues and they aren't working...

Ado012

TypeError: 'NoneType' object is not callable in del() when exiting

# Prerequisites - [x] I am running the latest code. - [x] I followed the README.md. - [x] I searched open/closed issues. - [x] This is a reproducible error and...

gbutiri

Slow when logits_all=True, inconsistent logprobs and solutions

6

# Environment Lastest version by 2025.03.26 # Issue I tried to use this library to record logits of my gguf model. But it seemed to be some errors on the...

For-rest2005

Safeerchalil:codespace-automatic-barnacle-5gv4qx4j775wf47g

Safeerchalil:codespace-automatic-barnacle-5gv4qx4j775wf47g _Originally posted by @Safeerchalil in https://github.com/abetlen/llama-cpp-python/issues/1978#issuecomment-2797951584_

Safeerchalil

Initial commit

1

pip install llama-cpp-python

Safeerchalil

How to use chat_template with .gguf models ? (tokenizer_name not implemented)

1

Hi, I'm currently facing this `tokenizer_name NotImplementedError` while testing quantized `.gguf`model with `[lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)` I'm having this trouble with --apply_chat_template run command lm_eval --model gguf --model_args base_url=http://127.0.1.1:8080 --tasks gsm8k --output_path result/gsm8k...

Bobchenyx

Running basic example from docs results in `TypeError: 'NoneType' object is not callable`

1

# Prerequisites Please answer the following questions for yourself before submitting an issue. - [x] I am running the latest code. Development is very rapid so there are no tagged...

nchammas

llama-cpp-python
llama-cpp-python copied to clipboard

Metadata

feat: Add Gemma3 chat handler (#1976)

Macos Metal Github Release for python 3.12 is broken

Add support for XTC and DRY samplers

CUDA llama-cpp-python build failed.

TypeError: 'NoneType' object is not callable in del() when exiting

Slow when logits_all=True, inconsistent logprobs and solutions

Safeerchalil:codespace-automatic-barnacle-5gv4qx4j775wf47g

Initial commit

How to use chat_template with .gguf models ? (tokenizer_name not implemented)

Running basic example from docs results in `TypeError: 'NoneType' object is not callable`

← Metadata

Owner

Metadata

llama-cpp-python llama-cpp-python copied to clipboard

Metadata

← Metadata

Owner

Metadata

llama-cpp-python
llama-cpp-python copied to clipboard