
Python bindings for llama.cpp

Results: 424 llama-cpp-python issues

```
pip install llama-cpp-python
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python which use PEP 517 and cannot be installed directly
```
it...
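Failures like this usually mean the local build toolchain (CMake plus a C/C++ compiler) is missing or misconfigured, and pip hides the real CMake error behind the generic wheel message. A minimal sketch of forcing a fresh, verbose source build — the environment variable names are the ones the package's build backend reads; the CUDA flag is one example backend choice:

```python
# Sketch: environment for a forced source build of llama-cpp-python
# (assumes CMake and a C++ compiler are installed and on PATH).
import os

# Extra flags forwarded to CMake; "-DGGML_CUDA=on" enables the CUDA backend.
os.environ["CMAKE_ARGS"] = "-DGGML_CUDA=on"
os.environ["FORCE_CMAKE"] = "1"  # build from source even if a wheel exists

# Then, in the same environment:
#   pip install --upgrade --force-reinstall --no-cache-dir --verbose llama-cpp-python
# --verbose surfaces the actual CMake/compiler error hidden behind
# "Failed building wheel for llama-cpp-python".
```

Running pip with `--verbose` is usually the fastest way to find the underlying cause (missing compiler, missing CUDA toolkit, etc.).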

I encountered the following problem when using llama-cpp-python on a Mac: the model's answers are completely unreasonable. The red box in the screenshot contains the question and answer. ![Image](https://github.com/user-attachments/assets/7ec01319-ff12-4de8-aab6-77e701de2fc3) The configuration is...

**This is just a question about the blessed path here.** I am wondering whether it is possible to build a Docker image including `llama-cpp-python` on a non-GPU host which targets...

Good morning all! I am running dual NVIDIA RTX 3090s at x8/x8 over NVLink, a 7950X3D, and 128 GB of RAM, but only the CPU is being used. Configuration of my Python script: ``` # ---...
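When the GPUs sit idle like this, the two usual suspects are a CPU-only wheel (see the build notes above) or `n_gpu_layers` left at its default of 0. A minimal sketch of the constructor arguments that offload all layers — the model path is hypothetical, and this assumes the package was built with `-DGGML_CUDA=on`:

```python
# Sketch: kwargs for llama_cpp.Llama to offload every layer to the GPU(s).
llama_kwargs = dict(
    model_path="/models/model.gguf",  # hypothetical path
    n_gpu_layers=-1,                  # -1 = offload all layers (default 0 = CPU only)
    n_ctx=4096,
    verbose=True,                     # prints the device/offload log at load time
)

# Usage (once a CUDA-enabled build is installed):
#   from llama_cpp import Llama
#   llm = Llama(**llama_kwargs)
```

With `verbose=True`, the load log shows whether layers were actually assigned to CUDA devices, which quickly distinguishes a config problem from a build problem.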

# Prerequisites

Please answer the following questions for yourself before submitting an issue.

- [x] I am running the latest code. Development is very rapid so there are no tagged...

```
(CUDA125-py311) D:\software\llama-cpp-python>set CMAKE_CXX_COMPILER="C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.43.34808\bin\Hostx64\x64\cl.exe"
(CUDA125-py311) D:\software\llama-cpp-python>set FORCE_CMAKE=1 && set CMAKE_ARGS=-DGGML_CUDA=on && pip install --upgrade --no-cache-dir --force-reinstall -v --prefer-binary llama-cpp-python
Using pip 25.0 from D:\software\minicondapy311\envs\CUDA125-py311\Lib\site-packages\pip (python 3.11)
Looking in...
```

I am experiencing issues while trying to launch the deepseek-v3 model with a 671B Q2_K_L quantized version on 4 x A100 (80GB) GPUs. The model fails to load, and I...
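For multi-GPU loads like this, `llama_cpp.Llama` accepts a `tensor_split` argument giving the fraction of the model weights to place on each device. A minimal sketch for an even split across the four A100s — the model path is hypothetical, and whether the model fits also depends on context size and KV-cache memory:

```python
# Sketch: splitting a large quantized model across 4 GPUs with tensor_split.
split_kwargs = dict(
    model_path="/models/deepseek-v3-q2_k_l.gguf",  # hypothetical path
    n_gpu_layers=-1,                               # offload all layers
    tensor_split=[0.25, 0.25, 0.25, 0.25],         # even split across 4 x A100
    n_ctx=2048,                                    # smaller context reduces KV-cache memory
)

# Usage:
#   from llama_cpp import Llama
#   llm = Llama(**split_kwargs)
```

Uneven fractions (e.g. a smaller share on the GPU that also drives the display, or GPU 0 which holds extra buffers) are a common tweak when one device runs out of memory first.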

Hi there. I see there is a Metal build for v0.3.5. Would you please release a CUDA version? Best regards

```
(CUDA125-py312) PS E:\llama-cpp-python\build> cmake -G "Visual Studio 17 2022" -A x64 `
>> -DCUDA_TOOLKIT_ROOT_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.5" `
>> -DCMAKE_CUDA_COMPILER="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.5/bin/nvcc.exe" `
>> -DCMAKE_CUDA_ARCHITECTURES="89" `
>> ...
```

I built the package with CUDA, so llama runs on the GPU, but the CLIP part still runs on the CPU. How can I fix this? Thanks. `clip_model_load: loaded meta data with 19 key-value pairs and...`