
LLM inference in C/C++

Results 1654 llama.cpp issues
Sorted by recently updated

- Building specific archs separately to get the maximum performance, smallest package size, and shortest build times possible (compare [a build for 7.5+8.0](https://download.ochafik.com/llama.cpp/llama-cpp-master-cuda-12.2-cap-7.5_8.0.zip) vs. [just 7.5](https://download.ochafik.com/llama.cpp/llama-cpp-master-cuda-12.2-cap-7.5.zip), for instance: `libggml-cuda.so` is almost...
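The per-architecture split described above can be sketched with CMake's standard CUDA architecture list. `GGML_CUDA` and `CMAKE_CUDA_ARCHITECTURES` are real llama.cpp/CMake options, but the build-directory names here are illustrative:

```shell
# Build only for compute capability 7.5 — smallest binary, shortest build time.
cmake -B build-75 -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="75"
cmake --build build-75 --config Release -j

# Build for 7.5 and 8.0 together — one package covering both, at the cost of size.
cmake -B build-75-80 -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="75;80"
cmake --build build-75-80 --config Release -j
```

Each build directory then contains a `libggml-cuda.so` compiled for only the listed capabilities, which is what makes the single-arch package noticeably smaller.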

devops
ggml

* Enabled an optional command-line argument `-o` on tts to specify an output filename; it defaults to `output.wav`.
* The program now returns `ENOENT` in case of a file write failure.

examples

### Name and Version

/build/bin/llama-quantize /mnt/data/model/Moonlight-16B-A3B-Instruct/Moonlight-16B-A3B-Instruct-BF16.gguf /mnt/data/model/Moonlight-16B-A3B-Instruct/Moonlight-16B-A3B-Instruct-Q4_K_M.gguf Q4_K_M

### Operating systems

Linux

### GGML backends

CUDA

### Hardware

RTX 4090D

### Models

Moonlight-16B-A3B-Instruct

### Problem description & steps to reproduce...

bug-unconfirmed

### Name and Version

llama-cli.exe --version
version: 4735 (73e2ed3c)
built with MSVC 19.42.34436.0 for x64

### Operating systems

Windows

### Which llama.cpp modules do you know to be affected?

llama-server...

bug-unconfirmed