
LLM inference in C/C++

Results: 1628 llama.cpp issues

OpenBLAS was enabled when compiling, but there seems to be no acceleration during inference. Compared to a build without OpenBLAS, it only increases memory usage. What is the...
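A typical BLAS-enabled build looks like the sketch below. The CMake flag names (`GGML_BLAS`, `GGML_BLAS_VENDOR`) are assumed from current llama.cpp build docs and have changed across versions, so check the docs for your checkout; `model.gguf` is a placeholder path.

```shell
# Sketch, assuming current llama.cpp CMake options; flag names have
# changed over time (older trees used LLAMA_BLAS), so verify locally.
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS
cmake --build build --config Release

# BLAS mainly accelerates prompt processing (large-batch matrix
# multiplies), not per-token generation, which can explain seeing
# no speedup during ordinary inference. model.gguf is hypothetical.
./build/bin/llama-bench -m model.gguf
```

Comparing `llama-bench` prompt-processing throughput with and without the BLAS build is a more targeted check than wall-clock generation speed.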

Change MAX GPU+CPU from 16 to 64 *Make sure to read the [contributing guidelines](https://github.com/ggml-org/llama.cpp/blob/master/CONTRIBUTING.md) before submitting a PR*

ggml

Motivation: `ggml_is_view_op` is a useful API.

### Use case 1
It is used by `test-backend-ops.cpp`.

### Use case 2
https://github.com/ggml-org/llama.cpp/blob/73e2ed3ce3492d3ed70193dd09ae8aa44779651d/src/llama.cpp#L8178-L8179

Let's say `cur` is a view operation and its source...

testing
ggml

### Background Description
Ref: https://github.com/ggerganov/llama.cpp/pull/7553 , required for supporting future vision models (https://github.com/ggerganov/llama.cpp/issues/8010). I initially planned to make a proposal PR for this, but it turns out to be quite a bit more complicated...

### Prerequisites
- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md).
- [X] I searched using keywords...

enhancement
stale

Contributors: what's the point of truncating to a strict 32-bit size_t value and then comparing against an int64_t? Is this legacy casting code left over from when it was rewritten to 64-bit types, or is...

bug-unconfirmed
stale

## Problem Description The current test command for the llama-server object in the project has a duplicated file extension, which prevents the test script from correctly locating the test file....

examples
server

### Prerequisites
- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md).
- [x] I searched using keywords...

enhancement

### Prerequisites
- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md).
- [x] I searched using keywords...

enhancement

### Git commit
73e2ed3ce3492d3ed70193dd09ae8aa44779651d

### Operating systems
Linux

### GGML backends
CUDA

### Problem description & steps to reproduce
I am trying to host a model using llama-server. Successfully built llama.cpp...

bug-unconfirmed