llama.cpp

LLM inference in C/C++

Results: 1637 llama.cpp issues

failed to tokenize string! system_info: n_threads = 16 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0...

bug

This change modifies the `quantize.sh` script so that it runs properly on different platforms (including Windows under WSL).

bugfix: std::string messes up vocab. OS: CentOS 7, compiler: gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)

bug

Hi everyone, I took a stab at adding an embedding mode, which prints the sentence embedding for the input instead of generating more tokens. If I only add the compute...

enhancement

This builds on my [other PR](https://github.com/ggerganov/llama.cpp/pull/267) to implement a very simple TCP mode. The new mode first loads the model, then listens for TCP connections on a port. When a...

enhancement

Add: https://github.com/gyunggyung/OpenMLLM Use: https://github.com/gyunggyung/KoAlpaca.cpp

Resolves https://github.com/ggerganov/llama.cpp/issues/240 (WIP). This needs to be able to:
1. Configure custom model folders.
2. Adjust settings for running variants of the Alpaca model and make corresponding changes in the...

enhancement
model

This is a prototype of computing perplexity over the prompt input. It does so by using `n_ctx - 1` tokens as the input to the model, and computes the softmax...

enhancement
generation quality

After running the command `python3 convert-pth-to-ggml.py /Users/tanish.shah/llama.cpp/models/7B/ 1`, sentencepiece fails with:

```
Traceback (most recent call last):
  File "/Users/tanish.shah/llama.cpp/convert-pth-to-ggml.py", line 75, in
    tokenizer = sentencepiece.SentencePieceProcessor(fname_tokenizer)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tanish.shah/llama.cpp/env/lib/python3.11/site-packages/sentencepiece/__init__.py", line 447, ...
```

need more info