
LLM inference in C/C++

Results: 1628 llama.cpp issues, sorted by recently updated

I encountered the same issue (#10556) on the Ascend 310B1 as well.
```
root@orangepiaipro-20t:/data/llama.cpp# cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=release
-- Warning: ccache not found - consider installing it for faster compilation or...
```
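For completeness, the quoted configure command is normally followed by a compile step; a minimal sketch, in which the `--build` invocation and `-j` flag are assumed rather than taken from the report:

```
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=release
# the build step below is assumed, not part of the quoted output
cmake --build build -j
```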

Adds example docs for converting a Granite vision model, which is essentially a LLaVA-NeXT model with multiple feature layers, using SigLIP for the visual encoder, and a Granite language...

examples

### Name and Version
```
llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Graphics (RADV VEGA20) (radv) | uma: 0 | fp16: 1 | warp size: 64...
```

bug-unconfirmed

- [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md)
- Self-reported review complexity:
  - [ ] Low
  - [x] Medium
  - [ ] High

examples
python

This pull request aims to integrate SIMD instructions via `vecintrin.h` into llama.cpp on the s390x platform. Currently the SIMD code paths are included in the following `ggml_vec_dot` functions:...

ggml
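For context, a minimal sketch of the general technique the entry above describes: a single-precision dot product using the z/Architecture vector intrinsics from `vecintrin.h`. This is not the PR's code; the function name, the `-march=z14 -mzvector` build flags, and the assumption that `n` is a multiple of 4 are all illustrative.

```
// A hedged sketch, not the PR's actual code: dot product of two float arrays
// using the s390x vector facility (build with e.g. -march=z14 -mzvector).
#include <vecintrin.h>

static float dot_f32_zvector(const int n, const float * x, const float * y) {
    __vector float acc = vec_splats(0.0f);           // 4 single-precision lanes
    for (int i = 0; i < n; i += 4) {                 // assumes n % 4 == 0
        const __vector float vx = vec_xl(0, x + i);  // unaligned loads
        const __vector float vy = vec_xl(0, y + i);
        acc = vec_madd(vx, vy, acc);                 // per-lane fused multiply-add
    }
    return acc[0] + acc[1] + acc[2] + acc[3];        // horizontal sum of the lanes
}
```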

Fixes https://github.com/ggml-org/llama.cpp/issues/11946. I added an option `GGML_CUDA_NO_FA` that is used for CUDA, HIP, and MUSA. Two more general questions about compile options:
* Do we have guidelines regarding whether...

Nvidia GPU
ggml
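For context on the compile option named in the entry above, a sketch of how such a toggle would be passed at configure time. The pairing with `GGML_CUDA=ON` and the build command are assumptions, not quoted from the PR; `NO_FA` presumably disables the FlashAttention kernels.

```
# hypothetical usage of the GGML_CUDA_NO_FA option described above
cmake -B build -DGGML_CUDA=ON -DGGML_CUDA_NO_FA=ON
cmake --build build --config Release
```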

### Name and Version
```
$ ./llama-cli --version
version: 3680 (947538ac)
built with cc (Debian 14.2.0-16) 14.2.0 for x86_64-linux-gnu
```
### Operating systems
Linux
### GGML backends
CPU
### Hardware
Intel Celeron 1007U...

bug-unconfirmed

### Name and Version
```
version: 4754 (de8b5a36)
built with Apple clang version 16.0.0 (clang-1600.0.26.6) for arm64-apple-darwin24.2.0
```
but also reproducing on the current main branch
### Operating systems
Mac...

bug-unconfirmed