llama.cpp
LLM inference in C/C++
### Name and Version

root@f7545b6b4f65:/app# ./llama-cli --version
load_backend: loaded CPU backend from ./libggml-cpu-alderlake.so
version: 4460 (ba8a1f9c)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

### Operating systems

Linux, Other? (Please...
### Git commit

git rev-parse HEAD
c5ede3849fc021174862f9c0bf8273808d8f0d39

### Operating systems

Linux

### GGML backends

CUDA

### Problem description & steps to reproduce

I want to build llama.cpp from source in...
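For reference, a typical from-source build with the CUDA backend looks roughly like the sketch below. This is a minimal example assuming git, CMake, a C/C++ toolchain and the CUDA toolkit are installed; the reporter's actual environment and flags are not shown in the truncated issue and may differ.

```bash
# Minimal sketch: build llama.cpp from source with the CUDA backend.
# Assumes the CUDA toolkit and CMake are already installed.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```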
* Add a new option `GGML_HIP_ROCWMMA_FATTN` that defaults to OFF
* Check for rocWMMA header availability when `GGML_HIP_ROCWMMA_FATTN` is enabled
* Define `FP16_MMA_AVAILABLE` when `GGML_HIP_ROCWMMA_FATTN` is enabled and target is...
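For context, enabling an option like this at configure time would look roughly as follows. Only `GGML_HIP_ROCWMMA_FATTN` comes from this PR; the surrounding HIP flags (`GGML_HIP`, `AMDGPU_TARGETS`, the `gfx1100` target) are the usual llama.cpp AMD build settings and are assumptions here, not part of the change.

```bash
# Sketch: configure a HIP build with the proposed rocWMMA FlashAttention option.
# GGML_HIP_ROCWMMA_FATTN defaults to OFF, so it must be enabled explicitly.
cmake -B build \
  -DGGML_HIP=ON \
  -DGGML_HIP_ROCWMMA_FATTN=ON \
  -DAMDGPU_TARGETS=gfx1100 \
  -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j
```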
### Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 3 CUDA devices:
  Device 0: Tesla P40, compute capability 6.1, VMM: yes
  Device 1: Tesla P40, compute capability...
Add a quick-access feature to the webui for custom configurations (including the prompt). As requested by @xydac in an old PR, the chat prompts from https://github.com/f/awesome-chatgpt-prompts have been loaded in...
CUDA 12.8 added an [option](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#compress-mode-default-size-speed-balance-none-compress-mode) to specify stronger compression for binaries. I ran some tests in CI with the [new CUDA 12.8 Ubuntu docker image](https://hub.docker.com/r/nvidia/cuda/):

## `89-real` arch

In this scenario,...
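The flag is passed to nvcc as `--compress-mode=<mode>` (modes per the linked docs: default, size, speed, balance, none). In a CMake build it can be forwarded roughly as in the sketch below; this assumes the generic `CMAKE_CUDA_FLAGS`/`CMAKE_CUDA_ARCHITECTURES` variables and the `89-real` architecture mentioned above, and does not reflect the exact CI wiring.

```bash
# Sketch: request stronger fatbin compression from nvcc (CUDA >= 12.8)
# when building only the sm_89 real architecture.
cmake -B build \
  -DGGML_CUDA=ON \
  -DCMAKE_CUDA_ARCHITECTURES=89-real \
  -DCMAKE_CUDA_FLAGS="--compress-mode=size"
cmake --build build --config Release -j
```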
### Name and Version

version: 4747 (c5d91a74)
built with cc (Debian 11.3.0-12) 11.3.0 for x86_64-linux-gnu

### Problem description & steps to reproduce

Webui unusably slow over network due to forced...