
LLM inference in C/C++

1641 llama.cpp issues

The llama code includes `view_as_real`: https://github.com/facebookresearch/llama/blob/main/llama/model.py#L68. How does `convert-pth-to-ggml.py` handle this part of the weights?

question
need more info
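As background for the question above, the following sketch illustrates (with NumPy rather than PyTorch) what `torch.view_as_real` does: a complex tensor of shape `(..., n)` is reinterpreted as a real tensor of shape `(..., n, 2)`, where the last axis holds (real, imaginary) pairs. The variable name `freqs_cis` is taken from the linked model code; the values here are illustrative only.

```python
import numpy as np

# Conceptual NumPy equivalent of torch.view_as_real: split each complex
# number into a (real, imaginary) pair along a new trailing axis.
freqs_cis = np.exp(1j * np.linspace(0.0, np.pi, 4))  # complex, shape (4,)
as_real = np.stack([freqs_cis.real, freqs_cis.imag], axis=-1)  # shape (4, 2)

print(as_real.shape)  # (4, 2)
```

A converter that stores these weights as plain floats would therefore see twice as many scalar values per complex entry.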

Edit: Most of the below is now outdated. This PR aims to do two things: replace EOS with a newline to prevent context/memory from being flushed by EOS in interactive mode, and better...

enhancement
generation quality

In `convert-pth-to-ggml.py`, `dir_model` is something like `models/7B` or `models/7B/`, and `tokenizer.model` is expected under the model's parent directory. When `dir_model` is a symlink, `f"{dir_model}/../tokenizer.model"` would not be found. Let's use the model's...

bug
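A minimal sketch of the symlink-safe lookup the issue is asking for: resolve `dir_model` first, so that `..` is applied to the real location rather than to the symlink itself. The helper name `find_tokenizer` is hypothetical, not from the repo.

```python
from pathlib import Path

def find_tokenizer(dir_model: str) -> Path:
    # resolve() follows symlinks, so the parent is that of the real
    # model directory, not of the symlink's own location.
    real_dir = Path(dir_model).resolve()
    return real_dir.parent / "tokenizer.model"
```

With this, `find_tokenizer("models/7B/")` and `find_tokenizer(some_symlink_to_7B)` both point at the `tokenizer.model` next to the real `7B` directory.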

Just a few minor cleanups. 1. Mac (Intel) related: * `$(UNAME_M)` shows "x86-64". * `shell sysctl -n hw.optional.arm64` outputs an error that should be ignored. * Add an additional comment on `-framework...

As per https://github.com/ggerganov/llama.cpp/blob/da5303c1ea68aa19db829c634f1e10d08d409680/main.cpp#L1066 the EOS flag in interactive mode simply causes `is_interacting` to switch on, and so it serves as a way to end the current series of tokens and...

enhancement
question
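The two EOS behaviours discussed above can be sketched in Python (the actual code is C++ in `main.cpp`; token ids here are illustrative, not taken from the repo):

```python
EOS_ID, NEWLINE_ID = 2, 13  # hypothetical token ids for illustration

def step_current(token_id: int, interactive: bool, embd: list) -> bool:
    """Current behaviour described in the issue: in interactive mode,
    EOS just hands control back to the user (is_interacting flips on).
    Returns True when control should return to the user."""
    if token_id == EOS_ID and interactive:
        return True
    embd.append(token_id)
    return False

def step_proposed(token_id: int, interactive: bool, embd: list) -> bool:
    """Proposed alternative: substitute a newline token for EOS so the
    context/memory is not flushed; generation simply continues."""
    if token_id == EOS_ID and interactive:
        token_id = NEWLINE_ID
    embd.append(token_id)
    return False
```

The trade-off is that the current behaviour lets EOS act as a natural turn-ending signal, while the substitution keeps the context window intact.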

I have seen that support for the alpaca model has been added in the master branch, and I have included the model in the docker scripts. Now you can try it like...

enhancement
build

Would be cool to be able to lean on the Neural Engine. Even if it wasn't much faster, it'd still be more energy efficient, I believe.

129c7d1e (#20) added a repetition penalty that prevents the model from running into loops. Here are a few suggestions for possible enhancements: * One issue with the interactive mode is...

enhancement
generation quality
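For reference, a hedged sketch of a repetition penalty of the kind added in #20: logits of recently generated tokens are scaled down so the sampler is less likely to pick them again. The sign handling follows the common scheme (divide positive logits, multiply negative ones); the exact details in `main.cpp` may differ.

```python
import numpy as np

def apply_repeat_penalty(logits: np.ndarray, last_tokens, penalty: float) -> np.ndarray:
    """Penalize tokens seen in last_tokens. penalty > 1 discourages repeats."""
    out = logits.copy()
    for t in set(last_tokens):
        # Dividing a positive logit and multiplying a negative one both
        # push the token's probability down.
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

logits = np.array([2.0, -1.0, 0.5])
penalized = apply_repeat_penalty(logits, [0, 1], 1.3)
```

Tuning `penalty` trades off loop prevention against the suggestions above, e.g. the interactive-mode issues the comment goes on to describe.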

Hello, I've tried out the Alpaca model, but after a while an error appears stating: "zsh: segmentation fault ./main -m ./models/alpaca/ggml-alpaca-7b-q4.bin --color -f -ins". Thanks. Code: ./main...

hardware

After PR #252, all base models need to be converted again. For me, this is a big breaking change. The LoRA and/or Alpaca fine-tuned models are not compatible anymore....

bug
model
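One way to detect whether a file predates such a format change is to read the leading `uint32` magic, which is what the loader itself checks. The magic values below are illustrative assumptions, not verified against PR #252; check the constants in the repo before relying on them.

```python
import struct

OLD_MAGIC = 0x67676D6C  # ASCII "ggml" (assumed pre-change magic)
NEW_MAGIC = 0x67676D66  # ASCII "ggmf" (assumed versioned format)

def needs_reconvert(path: str) -> bool:
    """Return True if the file carries the assumed old, unversioned magic."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
    return magic == OLD_MAGIC
```

A check like this could give users a clear "please re-convert this model" message instead of a confusing load failure.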