llama.cpp
LLM inference in C/C++
The llama code includes `view_as_real`: https://github.com/facebookresearch/llama/blob/main/llama/model.py#L68. How does convert-pth-to-ggml.py handle this part of the weights?
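For context, the operation in question in `apply_rotary_emb` looks roughly like the sketch below (illustrative only; the broadcast reshape of `freqs_cis` is omitted, and the function name and shapes are assumptions, not code from the conversion script):

```python
# Illustrative sketch of the view_as_complex / view_as_real round-trip used by
# apply_rotary_emb in the reference model.py. It operates on activations at
# runtime, not on checkpoint weights.
import torch

def rotate_sketch(xq: torch.Tensor, freqs_cis: torch.Tensor) -> torch.Tensor:
    # pair up the last dimension into complex numbers: (..., head_dim) -> (..., head_dim/2)
    xq_complex = torch.view_as_complex(xq.float().reshape(*xq.shape[:-1], -1, 2))
    # rotate by the precomputed complex frequencies, then flatten back to real
    return torch.view_as_real(xq_complex * freqs_cis).flatten(-2).type_as(xq)
```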
Edit: Most of the below is now outdated. This PR aims to do two things: - Replace EOS with a newline to prevent context/memory being flushed by EOS in interactive mode - Better...
In `convert-pth-to-ggml.py`, `dir_model` is something like `models/7B` or `models/7B/`, and `tokenizer.model` is expected under the model's parent dir. When `dir_model` is a symlink, `f"{dir_model}/../tokenizer.model"` would not be found. Let's use the model's...
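One way to address this (a hypothetical sketch, not the actual patch) is to compute the parent directory lexically instead of going through `..`, which the filesystem resolves against the symlink's target:

```python
# Hypothetical sketch: locate tokenizer.model via the lexical parent of
# dir_model, so a symlinked models/7B still finds models/tokenizer.model.
import os

dir_model = "models/7B/"                               # may be a symlink, may end in "/"
parent = os.path.dirname(os.path.normpath(dir_model))  # lexical parent: "models"
fname_tokenizer = os.path.join(parent, "tokenizer.model")
```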
Just a few minor cleanups. 1. Mac (Intel) related: * `$(UNAME_M)` shows "x86_64". * `shell sysctl -n hw.optional.arm64` outputs an error that should be ignored. * Add an additional comment on `-framework...
As per https://github.com/ggerganov/llama.cpp/blob/da5303c1ea68aa19db829c634f1e10d08d409680/main.cpp#L1066 the EOS flag in interactive mode simply causes `is_interacting` to switch on, and so it serves as a way to end the current series of tokens and...
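In other words, the control flow around that line amounts roughly to the following (a simplified Python sketch of the C++ logic in main.cpp; names and the return convention are illustrative, not the actual implementation):

```python
# Simplified sketch: in interactive mode, EOS does not terminate generation;
# it only switches is_interacting on so the user gets control back.
EOS_TOKEN_ID = 2  # EOS id in the LLaMA tokenizer

def handle_eos(interactive: bool) -> tuple[bool, bool]:
    """Return (is_interacting, stop_generation) after sampling an EOS token."""
    if interactive:
        return True, False   # hand control back to the user, keep running
    return False, True       # non-interactive: end generation
```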
I have seen that support for the Alpaca model has been added in the master branch, so I have included the model in the Docker scripts. Now you can try it like...
It would be cool to be able to lean on the Neural Engine. Even if it wasn't much faster, it would still be more energy efficient, I believe.
129c7d1e (#20) added a repetition penalty that prevents the model from running into loops. Here are a few suggestions for possible enhancements: * One issue with the interactive mode is...
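For reference, the core idea of such a repetition penalty can be sketched like this (a simplified Python sketch under assumed parameter names, not the C++ code from that commit):

```python
# Simplified sketch of a repetition penalty: logits of tokens already present
# in the recent context are pushed down before sampling.
def apply_repetition_penalty(logits: list[float], last_tokens: list[int],
                             penalty: float = 1.3) -> list[float]:
    for tok in set(last_tokens):
        if logits[tok] > 0:
            logits[tok] /= penalty   # shrink positive logits
        else:
            logits[tok] *= penalty   # push negative logits further down
    return logits
```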
Hello, I've tried out the Alpaca model, but after a while an error comes up, I believe stating: "zsh: segmentation fault ./main -m ./models/alpaca/ggml-alpaca-7b-q4.bin --color -f -ins". Thanks. Code: ./main...
After PR #252, all base models need to be converted again. For me, this is a big breaking change. The LoRA and/or Alpaca fine-tuned models are not compatible anymore...