llama.cpp issues

convert-pth-to-ggml.py error with "Got unsupported ScalarType BFloat16"

1

Trying to convert the weights of the "chavinlo/alpaca-native" model (https://huggingface.co/chavinlo/alpaca-native) to ggml, but got this error: Processing part 0 Processing variable: model.embed_tokens.weight with shape: torch.Size([32001, 4096]) and type: torch.float32 Processing...

austinchau

need more info

illegal instructions error on Android

26

First, thanks for the wonderful work so far! I managed to compile it on Linux and Windows, but I have a problem with Android. I have an A52 6 GB...

aicoat

need more info

android

"Illegal Instruction" error when converting 7B model to ggml FP16 format (Raspberry Pi 4, 8GB, Raspberry Pi OS, 64-bit)

1

Hello, I'm trying to replicate the process, using 7B on a Raspberry Pi 4 with 8GB of RAM. I'm running the latest Raspberry Pi OS 64-bit, and all of the...

lesp

need more info

GGML_ASSERT: ggml.c:4014: false zsh: abort ./main -m ./models/65B/ggml-model-q4_0.bin -t 16 -n 256 --repeat_penalty 1.0

5

Not sure why this happens. I am on the latest commit and up to date on everything. I did some tests, and it seems to break after ~500 tokens...

nazthelizard122

need more info

Converting GGML back to Torch checkpoint for HuggingFace/Pytorch consumption/training/finetuning

5

Here's a PR to convert a model written in a GGML format back to Torch checkpoint for HuggingFace/Pytorch consumption/training/finetuning. Mentioned in issue https://github.com/ggerganov/llama.cpp/issues/359 Also included the ability to use HF's...

ductai199x

enhancement

Add a Package.swift for SwiftPM support

2

This allows llama.cpp to be called directly from Swift! First add `https://github.com/ggerganov/llama.cpp` to your `Package.swift` or Xcode project, selecting either this branch or `master` (once the PR is merged). Here’s...

j-f1

enhancement

build

Investigate alternative approach for Q4 quantization

53

Currently, in [Q4_0](https://github.com/ggerganov/ggml/pull/27) quantization we choose the scaling factor for each group of 32 weights as `abs(max(x_i))/7`. It is easy to see that this is suboptimal. Consider quantization of the...

ggerganov

help wanted

good first issue

research 🔬
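
For readers unfamiliar with the scheme the Q4 quantization issue above refers to, here is a minimal sketch in plain Python. It is not the ggml implementation. It assumes the scale is the largest absolute value in a group of 32 weights divided by 7, and that each weight is rounded to a signed integer in [-7, 7]. The function names (`quantize_block`, `dequantize_block`, `max_abs_error`) are hypothetical and chosen only for illustration.

```python
BLOCK_SIZE = 32  # group size stated in the issue


def quantize_block(xs):
    """Quantize one group of weights with a single shared scale.

    Assumption: scale = max(|x_i|) / 7, levels are integers in [-7, 7].
    """
    amax = max(abs(x) for x in xs)
    scale = amax / 7.0
    if scale == 0.0:
        # All-zero block: every level is 0 and the scale is irrelevant.
        return 0.0, [0] * len(xs)
    # Python's round() rounds exact halves to even; a C implementation
    # may round differently, which only matters at exact .5 boundaries.
    qs = [max(-7, min(7, round(x / scale))) for x in xs]
    return scale, qs


def dequantize_block(scale, qs):
    """Reconstruct approximate weights from the scale and integer levels."""
    return [scale * q for q in qs]


def max_abs_error(xs, ys):
    """Largest absolute difference between original and reconstruction."""
    return max(abs(x - y) for x, y in zip(xs, ys))


if __name__ == "__main__":
    weights = [float(i - 16) for i in range(BLOCK_SIZE)]  # -16.0 .. 15.0
    scale, levels = quantize_block(weights)
    restored = dequantize_block(scale, levels)
    print("scale:", scale)
    print("max abs error:", max_abs_error(weights, restored))
```

Under these assumptions, the reconstruction error of any single weight is bounded by half the scale, so a single large outlier in a group widens the step size for all 32 weights. Measuring that error for a candidate alternative scale is one simple way to compare approaches.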

Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC

11

`__FMA__` and `__F16C__` are defined in GCC and Clang. They are not defined in MSVC; however, they are implied with AVX2/AVX512: https://learn.microsoft.com/en-us/cpp/build/reference/arch-x64?view=msvc-160 Thus, enable FMA and F16C in...

anzz1

bug

performance

[mqy] ./examples/chatLLaMa: line 53: 33476 Segmentation fault: 11

9

# Current Behavior `./examples/chatLLaMa`: after about 30 rounds of conversation, the program quits with `Segmentation fault: 11`. I tried again, entering the last question, but couldn't reproduce it. # Environment and Context * Physical...

mqy

bug

duplicate

model

Alpaca 7B faults on both macOS arm64 and Linux ppc64le

3

# Prerequisites Please answer the following questions for yourself before submitting an issue. - [x] I am running the latest code. Development is very rapid so there are no...

classilla

bug

model
