
LLM inference in C/C++

Results: 1641 llama.cpp issues, sorted by recently updated

Based on: https://github.com/qwopqwop200/GPTQ-for-LLaMa Current status: Something is busted. The output starts out decent, but quickly degrades into gibberish. This doesn't happen with either the original GPTQ-for-LLaMa using the same weights,...

enhancement
model

This fixes bug #292 as suggested [here](https://github.com/ggerganov/llama.cpp/issues/292#issuecomment-1476318351).

bug

If sorted keys are not required, change std::map to std::unordered_map. std::unordered_map is a hash table, so it should be faster than std::map when storing many items. std::map can be...

enhancement
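A minimal sketch of the suggested swap, using a hypothetical token-vocabulary map (not the actual llama.cpp type): since both containers share the same lookup/insert API, the change is a pure type change wherever callers do not iterate expecting sorted keys.

```cpp
#include <map>
#include <string>
#include <unordered_map>

// std::map is a balanced tree: O(log n) lookups, keys kept sorted.
// std::unordered_map is a hash table: amortized O(1) lookups, no ordering.
using vocab_sorted_t = std::map<std::string, int>;
using vocab_hashed_t = std::unordered_map<std::string, int>;

// Both containers expose the same find/at/insert API, so generic code
// compiles unchanged against either one.
template <typename Map>
int token_id(const Map &vocab, const std::string &tok) {
    auto it = vocab.find(tok);
    return it == vocab.end() ? -1 : it->second;
}
```

The only observable difference is iteration order, which only std::map guarantees.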

In interactive mode, every time the model has to respond to user input it has an increasingly reduced token budget, eventually generating only a few words before stopping. The token...

enhancement
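A hypothetical sketch of the reported behavior (not the actual llama.cpp code): if the remaining-token budget is shared across the whole session instead of being reset per response, each interactive turn gets fewer tokens until generation stops almost immediately.

```cpp
// Assumed, simplified generation state for illustration only.
struct gen_state {
    int n_remaining; // tokens left in a session-wide budget
};

// buggy pattern: one shared budget is drained across turns
int respond_shared_budget(gen_state &st, int n_requested) {
    int n = n_requested < st.n_remaining ? n_requested : st.n_remaining;
    st.n_remaining -= n;
    return n; // tokens actually generated this turn
}

// fixed pattern: every turn starts from the full per-response budget
int respond_per_turn(int n_predict, int n_requested) {
    return n_requested < n_predict ? n_requested : n_predict;
}
```

With a 128-token budget, the shared version yields 100, then 28, then 0 tokens over three turns, while the per-turn version yields 100 each time.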

- On older versions the function will silently fail without any ill effects
- Only used when params.use_color == true (--color)
- No windows.h dependency

bug

Some moving around of ANSI color code emissions in recent patches has left us in a situation where RESET codes were getting defensively emitted after every token, resulting in multibyte...

bug
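A hedged illustration of the problem described above (not llama.cpp's actual rendering code): emitting a RESET after every token interleaves escape bytes with token bytes, which can land inside a multibyte UTF-8 sequence split across tokens. Setting the color once and resetting once keeps the token byte stream contiguous.

```cpp
#include <string>
#include <vector>

static const std::string ANSI_BOLD  = "\x1b[1m";
static const std::string ANSI_RESET = "\x1b[0m";

// defensive per-token reset: escape codes separate every token,
// so a UTF-8 character split across two tokens gets escape bytes
// injected between its continuation bytes
std::string render_defensive(const std::vector<std::string> &toks) {
    std::string out;
    for (const auto &t : toks) out += ANSI_BOLD + t + ANSI_RESET;
    return out;
}

// single reset at the end: token bytes stay contiguous
std::string render_once(const std::vector<std::string> &toks) {
    std::string out = ANSI_BOLD;
    for (const auto &t : toks) out += t;
    return out + ANSI_RESET;
}
```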

rebase error

bug

Add OpenBSD support.

enhancement
build

A bit of refactoring, per https://github.com/ggerganov/llama.cpp/pull/252

enhancement

NOTE: I am seeing different outputs when running with these changes. They seem of equal quality, but this isn't something I observed when first testing this out on alpaca.cpp. It's...

enhancement
performance