
llama.exe doesn't handle relative file paths in Windows correctly

Open jjyuhub opened this issue 1 year ago • 9 comments

Please include the ggml-model-q4_0.bin model to actually run the code:

% make -j && ./main -m ./models/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -t 8 -n 512
I llama.cpp build info: 
I UNAME_S:  Darwin
I UNAME_P:  arm
I UNAME_M:  arm64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -pthread -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS:   -framework Accelerate
I CC:       Apple clang version 14.0.0 (clang-1400.0.29.202)
I CXX:      Apple clang version 14.0.0 (clang-1400.0.29.202)

cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -pthread -DGGML_USE_ACCELERATE   -c ggml.c -o ggml.o
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread -c utils.cpp -o utils.o
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread main.cpp ggml.o utils.o -o main  -framework Accelerate
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread quantize.cpp ggml.o utils.o -o quantize  -framework Accelerate
./main -h
usage: ./main [options]

options:
  -h, --help            show this help message and exit
  -s SEED, --seed SEED  RNG seed (default: -1)
  -t N, --threads N     number of threads to use during computation (default: 4)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: random)
  -n N, --n_predict N   number of tokens to predict (default: 128)
  --top_k N             top-k sampling (default: 40)
  --top_p N             top-p sampling (default: 0.9)
  --repeat_last_n N     last n tokens to consider for penalize (default: 64)
  --repeat_penalty N    penalize repeat sequence of tokens (default: 1.3)
  --temp N              temperature (default: 0.8)
  -b N, --batch_size N  batch size for prompt processing (default: 8)
  -m FNAME, --model FNAME
                        model path (default: models/llama-7B/ggml-model.bin)

main: seed = 1678619388
llama_model_load: loading model from './models/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: failed to open './models/7B/ggml-model-q4_0.bin'
main: failed to load model from './models/7B/ggml-model-q4_0.bin'

My pre-signed URL to download the model weights was broken.

jjyuhub avatar Mar 12 '23 11:03 jjyuhub

Windows, help me please. (screenshot attached)

1octopus1 avatar Mar 14 '23 19:03 1octopus1

Did you follow the instructions in the README.md to download, convert, and quantize the model? The model is not included in the repo.

gjmulder avatar Mar 14 '23 20:03 gjmulder

Did you follow the instructions in the README.md to download, convert, and quantize the model? The model is not included in the repo.

I tried everything... I did not see separate instructions for Windows (via CMake) =(

1octopus1 avatar Mar 15 '23 08:03 1octopus1

It is telling you it cannot find the model in ./models/7B. Is the ggml-model-q4_0.bin file in that directory?

gjmulder avatar Mar 15 '23 09:03 gjmulder
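A quick way to answer that question before invoking ./main is to check that the file exists and is plausibly sized. A minimal Python sketch (the path is taken from the log above; the ~4 GB figure is an assumption about a 7B q4_0 model):

```python
import os

# Path from the failing command above; adjust to your checkout.
model_path = "./models/7B/ggml-model-q4_0.bin"

if not os.path.isfile(model_path):
    print(f"missing: {model_path}")
else:
    size_gb = os.path.getsize(model_path) / 1e9
    # A 7B q4_0 model should be on the order of ~4 GB, not a few hundred KB.
    print(f"{model_path}: {size_gb:.2f} GB")
```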

I don't use PowerShell, and I don't know why ./Release/llama.exe is shown in yellow (I assume that means it exists?), but he is using forward slashes, and Windows doesn't normally use those. I don't know whether PowerShell has some fancy way of substituting the correct slashes. Also, does CMake create a Release folder just for the .exe, or are the models in there too? Anyway, I am going to assume that folder doesn't even exist because he's using the wrong slashes.

G2G2G2G avatar Mar 15 '23 09:03 G2G2G2G

Well, PowerShell supports forward slashes just fine, but on Windows the path argument to llama.exe is passed verbatim, i.e. it's up to llama.exe to parse the relative file path correctly.

sebgod avatar Mar 19 '23 03:03 sebgod
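For illustration, Windows path semantics can be inspected from any OS with Python's pathlib: PureWindowsPath treats forward and backward slashes as the same separator. This is a sketch of the general Windows convention, not of llama.exe's actual argument handling:

```python
from pathlib import PureWindowsPath

# Windows accepts both separators; these describe the same relative path.
fwd = PureWindowsPath("./Release/llama.exe")
back = PureWindowsPath(".\\Release\\llama.exe")

print(fwd == back)  # True: both normalize to Release\llama.exe
print(str(fwd))
```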

Reopened and corrected the issue title.

gjmulder avatar Mar 19 '23 11:03 gjmulder

Not sure if related, but the ggml-model-q4_0.bin I am getting is only 296 KB.

There is no error.

C:\llama\models\7B>quantize ggml-model-f16.bin ggml-model-q4_0.bin 2
llama_model_quantize: loading model from 'ggml-model-f16.bin'
llama_model_quantize: n_vocab = 32000
llama_model_quantize: n_ctx   = 512
llama_model_quantize: n_embd  = 4096
llama_model_quantize: n_mult  = 256
llama_model_quantize: n_head  = 32
llama_model_quantize: n_layer = 32
llama_model_quantize: f16     = 1
                           tok_embeddings.weight - [ 4096, 32000], type =    f16
C:\llama\models\7B>

VTSTech avatar Mar 19 '23 14:03 VTSTech

You should check your model file; it's too small. I got this error because I misspelled the model name...

zhouhh2017 avatar Mar 21 '23 09:03 zhouhh2017

Check the downloaded files against the checksums in the SHA256 file. Please reopen if the issue still persists.

prusnak avatar Apr 16 '23 09:04 prusnak
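That checksum comparison can be scripted. A minimal sketch, assuming you have the expected digest from the repo's SHA256 list at hand (the streaming buffer size is an arbitrary choice so multi-GB model files don't need to fit in RAM):

```python
import hashlib

def sha256sum(path, bufsize=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()

# Compare the printed digest against the entry in the repo's SHA256 file:
# print(sha256sum("./models/7B/ggml-model-q4_0.bin"))
```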