alpaca.cpp
13B causes seg fault likely due to not enough RAM - Error message needed
I am using the fully up to date alpaca repo with the 13B model. I downloaded the 13B model from the magnet link in the README. This is what I've run
make chat
./chat -m ../ggml-alpaca-13b-q4.bin
It outputs this:
main: seed = 1679542559
llama_model_load: loading model from '../ggml-alpaca-13b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 10959.49 MB
Segmentation fault
I think I have enough RAM (14 GB). Is this just what it does if you don't have enough RAM?
It's working for me
main: seed = 1679547522
llama_model_load: loading model from 'ggml-alpaca-13b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 10959.49 MB
llama_model_load: memory_size = 3200.00 MB, n_mem = 81920
llama_model_load: loading model part 1/1 from 'ggml-alpaca-13b-q4.bin'
llama_model_load: ............................................. done
llama_model_load: model size = 7759.39 MB / num tensors = 363
system_info: n_threads = 4 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
== Running in chat mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMA.
- If you want to submit another line, end your input in '\'.
I am wondering if your file might be different from mine. Can you check the md5sum or sha1sum to see if we have the same ggml file?
# md5sum ggml-alpaca-13b-q4.bin
66f3554e700bd06104a4a5753e5f3b5b ggml-alpaca-13b-q4.bin
# sha1sum ggml-alpaca-13b-q4.bin
9655ef2cc5056bd64ff6954d8f225cf4cbc8ad16 ggml-alpaca-13b-q4.bin
Are you using WSL? I used to have the same problem; WSL limits how much RAM it can use. You need to manually raise the memory allocation and close any other apps that are using memory.
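For WSL2 that limit is usually raised with a .wslconfig file in your Windows user profile (C:\Users\<you>\.wslconfig). The numbers below are only an example; pick values your machine can actually spare:

```
[wsl2]
memory=12GB
swap=8GB
```

Then run `wsl --shutdown` from Windows and reopen your distro so the new limits take effect.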
I think this is because I don't have enough RAM. This really needs an error message. It's also mentioned in this closed issue: https://github.com/antimatter15/alpaca.cpp/issues/45
This should add better error handling for when the memory buffer allocation fails. Currently the result of the memory allocation is assumed to always be successful, which leads to unexpected errors like these segmentation faults. This PR should at least add proper error handling and an error message: https://github.com/antimatter15/alpaca.cpp/pull/142
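For anyone following along, the gist of the fix is roughly this (a sketch of the idea, not the actual code from that PR): check the result of the large context-buffer allocation and report how much memory was requested instead of carrying on with a null pointer.

```cpp
// Sketch only -- not the actual patch from the PR. The point is to check the
// result of the big allocation instead of assuming it always succeeds, so an
// out-of-memory condition prints a clear message instead of segfaulting later.
#include <cstddef>
#include <cstdio>
#include <cstdlib>

static bool alloc_model_buffer(std::size_t ctx_size_bytes, void ** out_buf) {
    void * buf = std::malloc(ctx_size_bytes);
    if (buf == nullptr) {
        // Report the requested size so "not enough RAM" is obvious to the user.
        std::fprintf(stderr,
                     "error: failed to allocate %.2f MB for the model context "
                     "(not enough free RAM?)\n",
                     ctx_size_bytes / 1024.0 / 1024.0);
        return false;
    }
    *out_buf = buf;
    return true;
}
```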
fyi @james1236