llama.cpp

LLM inference in C/C++

Results: 1637 llama.cpp issues, sorted by recently updated

The goal of this refactor is to allow reusing the model execution while using streams other than stdin/stdout for interaction. In my case, I'd like to implement a simple TCP server...
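
To illustrate the kind of interaction loop such a refactor would enable, here is a minimal sketch of a blocking POSIX TCP server that feeds client input to the model instead of stdin. The `generate()` function is a hypothetical placeholder, not an actual llama.cpp call.

```cpp
// Sketch only: a blocking TCP server that would hand client input to a model
// instead of reading from stdin. generate() stands in for whatever API the
// refactor ends up exposing; it is not a real llama.cpp function.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <string>

static std::string generate(const std::string & prompt) {
    return "echo: " + prompt;  // placeholder for model inference
}

int main() {
    int srv = socket(AF_INET, SOCK_STREAM, 0);

    sockaddr_in addr{};
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(8080);

    bind(srv, (sockaddr *) &addr, sizeof(addr));
    listen(srv, 1);

    // serve a single client: read a prompt, write the generated reply
    int cli = accept(srv, nullptr, nullptr);
    char buf[4096];
    ssize_t n;
    while ((n = read(cli, buf, sizeof(buf))) > 0) {
        std::string reply = generate(std::string(buf, (size_t) n));
        write(cli, reply.data(), reply.size());
    }

    close(cli);
    close(srv);
}
```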

The following is a proposed template for creating new issues. If people think the tone could be improved, I'd appreciate feedback! ___ # Prerequisites Please answer the following questions for...

documentation
good first issue

I was tinkering with the code and made the following change in `line 977, main.cpp` (as it seemed wrong to me): *from* ```C if (embd.size() > params.n_batch) { break; }...

bug
generation quality

I am trying to output just the sentence embedding for a given input, instead of any new generated text. I think this should be rather straightforward but figured someone more...

question
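
As a generic illustration of the embedding question above: one common way to reduce per-token embeddings to a single sentence embedding is mean pooling. The sketch below is not tied to llama.cpp's API; how the per-token vectors are obtained from the model is assumed.

```cpp
#include <cstddef>
#include <vector>

// Generic sketch: collapse per-token embedding vectors (n_tokens x n_embd)
// into one sentence embedding by mean pooling. Obtaining the per-token
// vectors from the model is out of scope here.
std::vector<float> mean_pool(const std::vector<std::vector<float>> & token_embd) {
    if (token_embd.empty()) {
        return {};
    }
    const size_t n_embd = token_embd[0].size();
    std::vector<float> out(n_embd, 0.0f);
    for (const auto & tok : token_embd) {
        for (size_t i = 0; i < n_embd; ++i) {
            out[i] += tok[i];
        }
    }
    for (size_t i = 0; i < n_embd; ++i) {
        out[i] /= (float) token_embd.size();
    }
    return out;
}
```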

This appears to solve https://github.com/ggerganov/llama.cpp/issues/153, where the error `ggml_new_tensor_impl: not enough space in the context's memory pool` is thrown in interactive mode when using a larger context size. At least...

bug
high priority

I do not expect this to be merged, but I figured it might help others, though I don't know if this is the right place. This logs information to a...

enhancement

### Discussed in https://github.com/ggerganov/llama.cpp/discussions/234 Originally posted by **ShouNichi** March 17, 2023 After `git checkout 84d9015` and `make`, there is no output (only the model loading message) in Termux. `git...

bug
need more info

In the PR that was resolved (#132), the action that publishes the packages used the user and token of the author of the commit on master. In this case,...

enhancement
good first issue
build

It would be great to start doing this kind of quantitative analysis of `ggml`-based inference: https://bellard.org/ts_server/ It looks like Fabrice evaluates the models using something called LM Evaluation Harness: https://github.com/EleutherAI/lm-evaluation-harness...

enhancement
high priority
generation quality
research 🔬
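
As a sketch of the kind of metric such a quantitative evaluation would report, the snippet below computes perplexity from per-token probabilities; how those probabilities are extracted from a `ggml`-based model is assumed and left out.

```cpp
#include <cmath>
#include <vector>

// Sketch of a core evaluation metric: perplexity, the exponential of the
// average negative log-probability the model assigned to each observed token.
double perplexity(const std::vector<double> & token_probs) {
    if (token_probs.empty()) {
        return 1.0;  // degenerate case: nothing to score
    }
    double nll = 0.0;
    for (double p : token_probs) {
        nll += -std::log(p);
    }
    return std::exp(nll / (double) token_probs.size());
}
```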