
LLM inference in C/C++

Results: 1628 llama.cpp issues

Please include the `ggml-model-q4_0.bin` model to actually run the code:

```
% make -j && ./main -m ./models/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -t...
```

bug
model

Hi there, I downloaded my LLaMA weights through BitTorrent and tried to convert the 7B model to ggml FP16 format:

```
$ python convert-pth-to-ggml.py models/7B/ 1
normalizer.cc(51) LOG(INFO) precompiled_charsmap is empty....
```

* Ran into this error on a MacBook Pro M1:

  ```
  ./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
  [1] 18452 illegal hardware instruction  ./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
  ```

* What I've tried: *...

Hello! I noticed that the model loader is not using buffered I/O, so I added a piece of code for buffering. I measured the loading time only for LLaMA 7B...
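As a rough illustration of the technique being described (not the actual patch from this issue; the buffer size, file handling, and the `fread` loop below are assumptions made for the sketch), buffered I/O here means giving stdio a large user-space buffer so each underlying `read()` call moves much more data than the default:

```cpp
// Minimal sketch: read a model file through a large stdio buffer.
// Illustrative only; not the loader code from llama.cpp.
#include <cstdio>
#include <vector>

int main(int argc, char ** argv) {
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s <model-file>\n", argv[0]);
        return 1;
    }

    FILE * f = std::fopen(argv[1], "rb");
    if (!f) {
        std::perror("fopen");
        return 1;
    }

    // 1 MiB buffer (the size is an arbitrary choice for this sketch);
    // setvbuf must be called before the first read on the stream.
    std::vector<char> iobuf(1 << 20);
    std::setvbuf(f, iobuf.data(), _IOFBF, iobuf.size());

    std::vector<char> chunk(1 << 16);
    size_t total = 0;
    size_t n;
    while ((n = std::fread(chunk.data(), 1, chunk.size(), f)) > 0) {
        total += n; // a real loader would parse tensor data here
    }

    std::fclose(f);
    std::printf("read %zu bytes\n", total);
    return 0;
}
```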

Hi, first of all, thanks for the tremendous work! I wanted to ask: compared to your demo, when I run the same input sentence, the speed difference is...

need more info

This pull request adds a simple [Nix Flake](https://nixos.wiki/wiki/Flakes) for building and distributing the binaries of this repository in a combined package. The `main` binary can be executed like this (assuming...

- Combined nmake/Unix Makefile.
- `_alloca` instead of a variable-size array.
- Do not do math on `void*`; could cast to `char*`, but in this case, move the `uint8_t*` cast....
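On the `void*` point: pointer arithmetic on `void*` is a GNU extension that MSVC rejects, which is why casting to a byte-sized pointer matters for a combined nmake/Unix build. A minimal sketch of the portable pattern (a standalone example, not this PR's actual diff):

```cpp
// Portable byte-offset arithmetic: cast void* to a byte pointer first.
#include <cstdint>
#include <cstdio>

int main() {
    uint8_t storage[64] = {0};
    void * base = storage;

    // Non-portable: `base + 16` on a void* compiles under GCC/Clang
    // as an extension but is an error under MSVC.

    // Portable: cast to uint8_t* (or char*) before offsetting.
    uint8_t * p = (uint8_t *) base + 16;
    *p = 42;

    std::printf("offset byte = %d\n", *p);
    return 0;
}
```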

Heya! A friend showed this to me and I'm trying to get it to work myself on Windows 10. I've applied the changes as seen in #22 to get it to...

Would love to see a faster, more memory-efficient attention implementation, such as Flash Attention. :)

enhancement