llama.cpp
LLM inference in C/C++
Refactored git/file handling into specialized classes to support arbitrary input files. For now it will fall back to JSONL if SQLite fails, but there is also support for (multiple)...
### Feature Description I do not know if it is possible, but it would be nice to add draft model support in more places. If I am not wrong, only the...
### Name and Version ``` $ bin/llama-mtmd-cli --version version: 5343 (62d4250e) built with cc (GCC) 15.1.1 20250425 (Red Hat 15.1.1-1) for x86_64-redhat-linux ``` ### Operating systems Linux ### GGML backends...
Possibly supersedes #16813. This PR adds support for running concurrent CUDA streams on single-GPU setups. At the moment this only targets the Q, K, V branch. I feel this...
### Git commit Marks-MacBook-Air:llama.cpp mjgs$ git rev-parse HEAD aa3ee0eb0b80efca126cedf9bcb4fb5864b46ce3 ### Operating systems Mac ### GGML backends Vulkan ### Problem description & steps to reproduce I'm trying to compile llama.cpp with...
### Name and Version ilintar@LinuksowaJaskinia:/mnt/win/k/models/unsloth/granite-4.0-h-small-GGUF$ llama-cli --version load_backend: loaded BLAS backend from /devel/tools/llama.cpp/build/bin/libggml-blas.so register_backend: registered backend BLAS (1 devices) register_device: registered device BLAS (OpenBLAS) ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no...
### Name and Version Why would anyone implement the syntax error checking of the escaped JSON inside the LLM response in a way that does not work? What was the...
### Name and Version vulkan: version: 6719 (aa4711d3) built with cc (GCC) 15.2.1 20250813 for x86_64-pc-linux-gnu sycl: version: 6719 (aa4711d3) built with Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205) for x86_64-unknown-linux-gnu...
This PR builds on and supersedes https://github.com/ggml-org/llama.cpp/pull/15826 from @ngxson. * Adds `__EMSCRIPTEN__` preprocessor conditionals in places necessary for compilation (this included some OS-specific things in `common/`) * Adds flags...