llama.cpp

LLM inference in C/C++

Results 1641 llama.cpp issues

Trying to convert the weights of the "chavinlo/alpaca-native" model (https://huggingface.co/chavinlo/alpaca-native) to ggml, but got this error: Processing part 0 Processing variable: model.embed_tokens.weight with shape: torch.Size([32001, 4096]) and type: torch.float32 Processing...

need more info

First, thanks for the wonderful work so far! I managed to compile it on Linux and Windows, but I have a problem with Android. I have an A52 6 GB...

need more info
android

Hello, I'm trying to replicate the process using 7B on a Raspberry Pi 4 with 8GB of RAM. I'm running the latest Raspberry Pi OS 64-bit, and all of the...

need more info

Not sure why this happens. I am on the latest commit and up to date on everything. I did some tests, and it seems to break after ~500 tokens...

need more info

Here's a PR to convert a model written in the GGML format back to a Torch checkpoint for Hugging Face/PyTorch consumption, training, or fine-tuning. Mentioned in issue https://github.com/ggerganov/llama.cpp/issues/359 Also included the ability to use HF's...

enhancement

This allows llama.cpp to be called directly from Swift! First add `https://github.com/ggerganov/llama.cpp` to your `Package.swift` or Xcode project, selecting either this branch or `master` (once the PR is merged). Here’s...

enhancement
build

Currently, in [Q4_0](https://github.com/ggerganov/ggml/pull/27) quantization, we choose the scaling factor for each group of 32 weights as `abs(max(x_i))/7`. It is easy to see that this is suboptimal. Consider quantization of the...

help wanted
good first issue
research 🔬
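
To make the scaling rule in the Q4_0 excerpt above concrete, here is a minimal pure-Python sketch. It is not the ggml implementation. It assumes the formula `abs(max(x_i))/7` means "largest magnitude in the group, divided by 7", and that quantized values are rounded and clamped to the integers -7..7. The names `quantize_block`, `dequantize_block`, and `rms_error` are hypothetical helpers introduced only for illustration.

```python
import math

BLOCK_SIZE = 32  # group size named in the issue excerpt


def quantize_block(x):
    """Quantize one group of 32 floats to integers in [-7, 7].

    Assumption: the scale is max(|x_i|) / 7, which is how this sketch
    reads the `abs(max(x_i))/7` formula from the excerpt.
    """
    if len(x) != BLOCK_SIZE:
        raise ValueError("expected exactly %d weights" % BLOCK_SIZE)
    amax = max(abs(v) for v in x)
    scale = amax / 7.0
    if scale == 0.0:
        # All-zero group: nothing to scale.
        return 0.0, [0] * BLOCK_SIZE
    # Python's round() rounds ties to even; real implementations may differ.
    q = [max(-7, min(7, round(v / scale))) for v in x]
    return scale, q


def dequantize_block(scale, q):
    """Map quantized integers back to approximate float weights."""
    return [scale * v for v in q]


def rms_error(x, scale, q):
    """Root-mean-square reconstruction error for a given scale and codes."""
    restored = dequantize_block(scale, q)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, restored)) / len(x))
```

Because `rms_error` accepts any scale and set of codes, it can be used to compare this scaling choice against alternatives on the same group of weights.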

`__FMA__` and `__F16C__` are defined in GCC and Clang. `__FMA__` and `__F16C__` are not defined in MSVC; however, they are implied by AVX2/AVX512: https://learn.microsoft.com/en-us/cpp/build/reference/arch-x64?view=msvc-160 Thus, enable FMA and F16C in...

bug
performance

# Current Behavior Running `./examples/chatLLaMa`: after about 30 rounds of conversation, the program quits with `Segmentation fault: 11`. I tried again, entering the last question, but couldn't reproduce it. # Environment and Context * Physical...

bug
duplicate
model

# Prerequisites Please answer the following questions for yourself before submitting an issue. - [ X] I am running the latest code. Development is very rapid so there are no...

bug
model