
LLM inference in C/C++

Results: 1637 llama.cpp issues

I have been experimenting with q4_1 quantisation (since [some preliminary results](https://nolanoorg.substack.com/p/int-4-llama-is-not-enough-int-3-and) suggest it should perform better), and noticed that something about the pipeline for the 13B parameter model is broken...

bug
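
For context on the excerpt above: q4_1 differs from q4_0 by storing a per-block minimum alongside the scale, which lets it represent asymmetric value ranges more faithfully. Below is a minimal sketch of that idea; the block size of 32 matches ggml, but the struct layout, names, and rounding here are assumptions, not the repository's exact code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Sketch of q4_1-style quantization: each block stores a scale (d),
// a minimum (m), and 4-bit indices, so values are reconstructed as
// x' = q * d + m. q4_0 stores only a scale, hence the expected
// accuracy difference. Layout here is illustrative, not ggml's.
struct BlockQ41 {
    float   d;      // scale
    float   m;      // block minimum
    uint8_t qs[16]; // 32 x 4-bit quants, two per byte
};

BlockQ41 quantize_block_q4_1(const float *x) {
    float lo = x[0], hi = x[0];
    for (int i = 1; i < 32; ++i) {
        lo = std::min(lo, x[i]);
        hi = std::max(hi, x[i]);
    }
    BlockQ41 b{};
    b.m = lo;
    b.d = (hi - lo) / 15.0f;
    const float id = b.d != 0.0f ? 1.0f / b.d : 0.0f;
    for (int i = 0; i < 16; ++i) {
        // Map each value onto [0, 15] relative to the block minimum.
        const int q0 = std::clamp((int)std::lround((x[2*i + 0] - lo) * id), 0, 15);
        const int q1 = std::clamp((int)std::lround((x[2*i + 1] - lo) * id), 0, 15);
        b.qs[i] = (uint8_t)(q0 | (q1 << 4));
    }
    return b;
}
```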

See this issue: https://github.com/facebookresearch/llama/pull/73

Following on from the "Store preprocessed prompts" issue, it would be good to be able to take in a text file with a generic prompt & flags to start a chatbot...

enhancement

Hey, I know someone already posted a similar issue that has since been closed, but I ran into the same thing. On Windows 10, and I cloned just yesterday

duplicate
need more info

This is for issue #91. Treat this as a first draft. There are definitely some things that need to be changed and will be changed shortly. I have not benchmarked....

Adds a context-size parameter (-c for short) that takes the context size from the user's input. It defaults to the previously hardcoded 512.
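
A minimal sketch of what such a flag could look like; the struct and function names below are hypothetical, not the repository's actual ones.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical parameter struct; names are illustrative only.
struct Params {
    int n_ctx = 512; // context size, keeping the previous default
};

// Scan argv for "-c N" / "--ctx_size N"; leave the default otherwise.
bool parse_args(int argc, char **argv, Params &params) {
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], "-c") == 0 ||
            std::strcmp(argv[i], "--ctx_size") == 0) {
            if (i + 1 >= argc) return false; // flag given without a value
            params.n_ctx = std::atoi(argv[++i]);
        }
    }
    return params.n_ctx > 0;
}
```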

Fixes the colored output messing up the terminal when the program exits, by printing an ANSI_COLOR_RESET. The reset is included in the SIGINT handler too.
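
A sketch of the shape of that fix, assuming the usual signal() approach; the macro and handler names are illustrative. (Strictly speaking, printf/exit are not async-signal-safe, but this mirrors the simple pattern described above.)

```cpp
#include <csignal>
#include <cstdio>
#include <cstdlib>

#define ANSI_COLOR_RESET "\x1b[0m"

// Restore the terminal's default color before exiting, so an
// interrupted run doesn't leave the shell tinted.
void sigint_handler(int /*signo*/) {
    std::printf(ANSI_COLOR_RESET "\n");
    std::exit(130); // conventional exit code for SIGINT
}

int main() {
    std::signal(SIGINT, sigint_handler);
    // ... generation loop; on normal exit, also print the reset:
    std::printf(ANSI_COLOR_RESET);
    return 0;
}
```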

I'm not fully familiar with this codebase, so pardon me if I'm wrong. My first attempt to modify the code was to expand the hardcoded context window of 512 to 4096, but...

enhancement

When converting the model + tokenizer, use the vocabulary size returned by the tokenizer rather than assuming 32000. There are ways that special tokens or other new tokens could be...
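
The conversion script itself is Python, but the principle can be sketched in C++ against SentencePiece's public API: read the piece count from the loaded tokenizer model instead of hardcoding 32000. The program below is an illustration of that idea, not the repository's converter.

```cpp
#include <cstdio>
#include <sentencepiece_processor.h>

// Read the vocabulary size from the tokenizer model rather than
// assuming a fixed 32000, so added/special tokens are accounted for.
int main(int argc, char **argv) {
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s tokenizer.model\n", argv[0]);
        return 1;
    }
    sentencepiece::SentencePieceProcessor sp;
    if (!sp.Load(argv[1]).ok()) {
        std::fprintf(stderr, "failed to load tokenizer model\n");
        return 1;
    }
    const int n_vocab = sp.GetPieceSize(); // use this, not a constant
    std::printf("n_vocab = %d\n", n_vocab);
    return 0;
}
```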