llama.cpp

LLM inference in C/C++

Results: 1628 llama.cpp issues, sorted by recently updated

Idea from https://github.com/ggerganov/llama.cpp/issues/23#issuecomment-1465308592: we can add a `--cache_prompt` flag that, if set, dumps the computed KV cache from prompt processing to a file on disk with a name... (a sketch of one possible approach follows after the labels below).

enhancement
help wanted
good first issue
high priority
🦙.
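
A minimal sketch of what such a flag might do, assuming the implementation can expose the KV cache as a flat byte buffer. Every name below (`cache_path_for`, `dump_kv_cache`, `kv_data`, `kv_size`) is hypothetical and not part of llama.cpp's actual API:

```cpp
// Hypothetical sketch: persist a prompt's KV cache to disk, keyed by a hash
// of the prompt text. kv_data/kv_size stand in for whatever buffer the
// implementation exposes; they are illustrative, not llama.cpp's real API.
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>

static std::string cache_path_for(const std::string & prompt) {
    // Name the cache file after a hash of the prompt so identical prompts
    // map to the same file.
    const std::size_t h = std::hash<std::string>{}(prompt);
    char buf[64];
    std::snprintf(buf, sizeof(buf), "prompt-%zx.kvcache", h);
    return buf;
}

static bool dump_kv_cache(const std::string & prompt,
                          const std::uint8_t * kv_data, std::size_t kv_size) {
    const std::string path = cache_path_for(prompt);
    std::FILE * f = std::fopen(path.c_str(), "wb");
    if (!f) {
        return false;
    }
    const bool ok = std::fwrite(kv_data, 1, kv_size, f) == kv_size;
    std::fclose(f);
    return ok;
}
```

Keying the file name on a hash of the prompt means a later run with an identical prompt can locate and reload the dump instead of recomputing the caches.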

I cloned the GitHub repository and ran the make command, but was unable to get the cpp files to compile successfully. Any help or suggestions would be appreciated. Terminal output:...

`dotprod` extensions aren't available on some ARM CPUs (e.g. the Raspberry Pi 4), so check for them and only use them if they're available. Reintroduces the code removed in 84d9015 if...
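
One way to perform such a check at runtime on aarch64 Linux is via the kernel's HWCAP bits. This is an illustrative sketch of the general technique under that assumption, not the actual change from the commit above:

```cpp
// Illustrative sketch: detect the ARMv8.2 dot-product extension at runtime
// on aarch64 Linux via getauxval(). Other platforms (macOS, 32-bit ARM)
// need their own checks; this is one possible approach, not the PR's patch.
#include <stdio.h>
#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif

static int cpu_has_dotprod(void) {
#if defined(__aarch64__) && defined(__linux__)
    return (getauxval(AT_HWCAP) & HWCAP_ASIMDDP) != 0;
#else
    return 0; // assume unavailable when we cannot detect it
#endif
}

int main(void) {
    printf("dotprod: %s\n", cpu_has_dotprod() ? "available" : "unavailable");
    return 0;
}
```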

Everything's OK until this step:

python3 convert-pth-to-ggml.py models/7B/ 1
{'dim': 4096, 'multiple_of': 256, 'n_heads': 32, 'n_layers': 32, 'norm_eps': 1e-06, 'vocab_size': 32000}
n_parts = 1
Processing part 0
Killed

models/7B/ggml-model-f16.bin isn't...

I'm not an expert on licenses, but if you attribute Facebook in the README and description, you essentially admit/imply that this repo is a modification of their repo. Facebook's repo...

First of all, thank you for the effort of the entire community; the work you do is impressive. I'm going to try to do my bit by dockerizing this client...

Now that we have a shiny new CMake frontend, can we:
- eliminate the Makefile?
- document the CMake build instructions?
As far as I know, users might use the make...

enhancement

Both the `ggml-model-q4_0` and `ggml-model-f16` models produce garbage output on my M1 Air (8 GB), using the 7B LLaMA model. I've seen the quantized model having problems, but I doubt the...

Add a banner with a C++ llama logo to the `README.md`. ![banner](https://user-images.githubusercontent.com/4641499/225103864-afc1483a-677d-440a-b71e-9c5842c12268.png) Preview here: [https://github.com/leszekhanusz/llama.cpp/tree/readme_llama_banner](https://github.com/leszekhanusz/llama.cpp/tree/readme_llama_banner). Current discussion is in issue #105. The text can be changed if needed; suggestions welcome. The...

I found that the LLaMA-7B model shuts down unexpectedly when the number of tokens in the prompt reaches some value, approximately 500; this cannot be...
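
A plausible explanation, not confirmed in the report above, is that the prompt is running up against the context window (early llama.cpp builds defaulted to a 512-token context). A minimal sketch of a guard that truncates over-long prompts, where `check_prompt_fits`, `n_ctx`, and `n_predict` are all hypothetical names rather than the repo's actual variables:

```cpp
// Hypothetical guard: truncate prompts that would overflow the model's
// context window instead of letting evaluation fail partway through.
#include <cstdio>
#include <vector>

static bool check_prompt_fits(std::vector<int> & tokens, int n_ctx, int n_predict) {
    // Leave room in the context for the tokens we intend to generate.
    const int max_prompt = n_ctx - n_predict;
    if ((int) tokens.size() <= max_prompt) {
        return true;
    }
    std::fprintf(stderr, "prompt has %zu tokens but only %d fit; truncating\n",
                 tokens.size(), max_prompt);
    tokens.resize(max_prompt > 0 ? (std::size_t) max_prompt : 0);
    return false;
}
```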