llama.cpp
LLM inference in C/C++
### Name and Version
llama-b4730-bin-win-hip-x64-gfx1030

### Operating systems
Windows

### GGML backends
HIP

### Hardware
Ryzen 7 8840u

### Models
any model

### Problem description & steps to reproduce
when...
### Name and Version
./build/bin/Debug/llama-cli --version
register_backend: registered backend Metal (1 devices)
register_device: registered device Metal (AMD Radeon HD GFX10 Family Unknown Prototype)
register_backend: registered backend BLAS (1 devices)
register_device:...
See https://github.com/ggerganov/ggml/issues/1025, except I decided to implement the training directly in llama.cpp after all, because the GPT-2 GGML example is already pretty complex and would require a significant amount of effort...
### Name and Version
.\build\bin\Release\llama-cli.exe --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | warp...
### Name and Version
./llama-server --version
load_backend: loaded CPU backend from ./libggml-cpu-haswell.so
version: 4457 (ee7136c6)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

### Operating systems
Linux

### GGML backends...
Hello, I'm very new to this repo and just read through the quickstart. I'm curious whether this repo supports a Qwen model quantized with AWQ, which has int4 weights and...
### Prerequisites
- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md).
- [x] I searched using keywords...
This PR is a follow-up to #11769. It implements the following ops for Vulkan:

* GGML_OP_ROPE_BACK
* GGML_OP_RMS_NORM_BACK
* GGML_OP_SILU_BACK
* GGML_OP_SOFTMAX_BACK

Shaders are mostly copy-pasted from CUDA kernels with...
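For reference, the elementwise math that a SiLU backward op has to produce is small; the snippet below is only a scalar sketch of that gradient (assuming the usual definition silu(x) = x·σ(x)), not the actual Vulkan shader or CUDA kernel:

```cpp
#include <cmath>
#include <cstdio>
#include <initializer_list>

// Scalar sketch of the SiLU backward pass:
//   silu(x)      = x * sigmoid(x)
//   d/dx silu(x) = sigmoid(x) * (1 + x * (1 - sigmoid(x)))
// The backward op multiplies this local derivative by the incoming gradient.
static float silu_back(float grad_out, float x) {
    const float s = 1.0f / (1.0f + std::exp(-x));
    return grad_out * s * (1.0f + x * (1.0f - s));
}

int main() {
    // Upstream gradient of 1.0, so the output is just the local derivative.
    for (float x : {-2.0f, 0.0f, 2.0f}) {
        std::printf("x = %5.2f  d silu/dx = %.6f\n", x, silu_back(1.0f, x));
    }
    return 0;
}
```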
Before any generation has been made by the `llama.cpp` server, the `/metrics` endpoint reports a `-nan` value resulting from a division by zero. This PR ensures that the division by zero never...
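The shape of such a fix is a guard on the denominator before computing the average; the snippet below is only an illustrative sketch with hypothetical names, not the actual llama-server code:

```cpp
#include <cstdio>

// Sketch: averaging a counter such as "tokens predicted per second" before
// any generation has happened divides by zero and exports -nan in the metrics.
static double safe_average(double total, double count) {
    // Guard the denominator so the exported metric is 0 instead of -nan.
    return count > 0.0 ? total / count : 0.0;
}

int main() {
    std::printf("no requests yet : %f\n", safe_average(0.0, 0.0));   // 0.000000, not -nan
    std::printf("after requests  : %f\n", safe_average(120.0, 2.0)); // 60.000000
    return 0;
}
```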
### Name and Version
-

### Operating systems
Linux

### Which llama.cpp modules do you know to be affected?
llama-server

### Command line
```shell
podman pull ghcr.io/ggml-org/llama.cpp:server-rocm
```

### Problem...