alpaca.cpp
chat.exe instantly exits with no text or error message
chat.exe produces a blank line with no text and then exits. This is on Windows 10, compiled with CMake.
Please help.
D:\ALPACA\alpaca-win>chat.exe
main: seed = 1679458006
llama_model_load: loading model from 'ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.34 MB
llama_model_load: memory_size = 2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from 'ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size = 4017.27 MB / num tensors = 291
system_info: n_threads = 4 / 4 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
D:\ALPACA\alpaca-win>
Same problem. I downloaded the compiled exe from Releases. It shows a blank line after the model loads and then exits. I also tried running the same exe file on another PC (newer CPU, but same OS, Windows 10), and on that machine there is no such error.
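The fact that the very same exe runs on a newer CPU but exits silently on the older one suggests the prebuilt binary may rely on instruction-set extensions (the banner reports AVX = 1 | AVX2 = 1) that the older CPU does not have. As a rough diagnostic, something like the following could be compiled on the failing machine to see which extensions the CPU actually reports. This is just a sketch using GCC's __builtin_cpu_supports builtin, not part of alpaca.cpp:

/* cpucheck.c - print which x86 extensions this CPU reports.
   Build with MinGW/GCC: gcc -O2 cpucheck.c -o cpucheck.exe */
#include <stdio.h>

int main(void) {
    __builtin_cpu_init();  /* initialize GCC's CPU feature detection */
    printf("sse3: %s\n", __builtin_cpu_supports("sse3") ? "yes" : "no");
    printf("avx : %s\n", __builtin_cpu_supports("avx")  ? "yes" : "no");
    printf("avx2: %s\n", __builtin_cpu_supports("avx2") ? "yes" : "no");
    printf("fma : %s\n", __builtin_cpu_supports("fma")  ? "yes" : "no");
    return 0;
}

If avx2 comes back as "no" on the machine where chat.exe exits immediately, that would point to the binary being built for a newer CPU rather than to a problem with the model file.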
I have the same problem.
F:\lama\alpaca-win>.\chat.exe -m ggml-alpaca-7b-q4.bin
main: seed = 1679476791
llama_model_load: loading model from 'ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.34 MB
llama_model_load: memory_size = 2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from 'ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size = 4017.27 MB / num tensors = 291
system_info: n_threads = 4 / 4 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
F:\lama\alpaca-win>.\chat.exe -m ggml-alpaca-7b-q4.bin
Mine quits after accepting my question:
PS E:\repo\langchain-alpaca\dist\binary> ./chat.exe --model "e:\repo\langchain-alpaca\model\ggml-alpaca-7b-q4.bin" --threads 6
main: seed = 1679490011
llama_model_load: loading model from 'e:\repo\langchain-alpaca\model\ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.34 MB
llama_model_load: memory_size = 2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from 'e:\repo\langchain-alpaca\model\ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size = 4017.27 MB / num tensors = 291
system_info: n_threads = 6 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
== Running in chat mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMA.
- If you want to submit another line, end your input in '\'.
> Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. harrison went to harvard ankush went to princeton Question Where did harrison go to college Helpful Answer
I have enough memory for it, so it is not an OOM.
I have exactly the same issue. Compiled with CMake and VS2019. I get the info about the parameters, it waits about 10 seconds, and then it exits without producing any error.
I think I found part of the problem (not necessarily the real cause, and nowhere near a solution), but it may at least help the maintainers. In the function llama_eval, this call:
ggml_graph_compute(ctx0, &gf);
is the one that never finishes before the program aborts.
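To narrow that down further, one option is to wrap the call so that a hang or silent crash inside the compute step at least leaves a trace on stderr. A minimal sketch, assuming it is pasted into chat.cpp and called as debug_graph_compute(ctx0, &gf) in place of the direct call inside llama_eval (the wrapper name and messages are mine, not part of the project):

#include <stdio.h>
#include <time.h>
#include "ggml.h"

/* Debugging sketch: time the suspect call and report entry/exit on stderr,
   so a hang or a crash inside ggml_graph_compute becomes visible. */
static void debug_graph_compute(struct ggml_context * ctx0, struct ggml_cgraph * gf) {
    time_t t0 = time(NULL);
    fprintf(stderr, "entering ggml_graph_compute\n");
    ggml_graph_compute(ctx0, gf);  /* the call reported to never finish */
    fprintf(stderr, "ggml_graph_compute returned after %ld s\n",
            (long)(time(NULL) - t0));
}

If "entering" is printed but "returned" never is, the crash or hang is inside the compute step; if neither appears, the problem is earlier.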
Same issue, with even less text than the others.
E:\AI-Chat\alpaca-win>chat -m ggml-alpaca-13b-q4.bin
main: seed = 1679516224
llama_model_load: loading model from 'ggml-alpaca-13b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 10959.49 MB
E:\AI-Chat\alpaca-win>
Clicking on chat.exe will not load anything.
Me too. Windows 10, 32 GB RAM. "chat.exe has stopped working". Same for 7B and 13B.
I don't even get to ask it a question.
Same here
D:\StableDiffusion\Alpaca>chat.exe -i -m ggml-alpaca-13b-q4.bin -t 1
main: seed = 1679581051
llama_model_load: loading model from 'ggml-alpaca-13b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 10959.49 MB
llama_model_load: memory_size = 3200.00 MB, n_mem = 81920
llama_model_load: loading model part 1/1 from 'ggml-alpaca-13b-q4.bin'
llama_model_load: ............................................. done
llama_model_load: model size = 7759.39 MB / num tensors = 363
system_info: n_threads = 1 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
16 GB of RAM, of which about 12 GB are in use during "startup".
Building with MINGW might help:
- Install MSYS2 (www.msys2.org)
- Place sources into C:\msys64\home\USERNAME\alpaca.cpp and apply this patch: https://github.com/antimatter15/alpaca.cpp/pull/84
- Inside a UCRT64 terminal run:
  pacman -S mingw-w64-ucrt-x86_64-gcc
  pacman -S make
  cd alpaca.cpp
  make
chat.exe should appear there: C:\msys64\home\USERNAME\alpaca.cpp\chat.exe
Doesn't work for me. When I run make:
D:/msys64/ucrt64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/f16cintrin.h:52:1: error: inlining failed in call to 'always_inline' '_mm256_cvtph_ps': target specific option mismatch
   52 | _mm256_cvtph_ps (__m128i __A)
      | ^~~~~~~~~~~~~~~
ggml.c:911:33: note: called from here
  911 | #define GGML_F32Cx8_LOAD(x) _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))
      |                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:911:33: note: in definition of macro 'GGML_F32Cx8_LOAD'
  911 | #define GGML_F32Cx8_LOAD(x) _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))
      |                             ^~~~~~~~~~~~~~~
ggml.c:1274:21: note: in expansion of macro 'GGML_F16_VEC_LOAD'
 1274 | ay[j] = GGML_F16_VEC_LOAD(y + i + j*GGML_F16_EPR, j);
      |         ^~~~~~~~~~~~~~~~~
D:/msys64/ucrt64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/f16cintrin.h:52:1: error: inlining failed in call to 'always_inline' '_mm256_cvtph_ps': target specific option mismatch
   52 | _mm256_cvtph_ps (__m128i __A)
etc
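That particular GCC error usually means the F16C intrinsic _mm256_cvtph_ps is being used in a translation unit that was not compiled with F16C enabled. A minimal, hypothetical reproduction (the file name and flags are only for illustration; I have not checked what the patched Makefile actually passes to GCC):

/* f16c_repro.c - tiny reproduction of the "target specific option mismatch" error */
#include <immintrin.h>

__m256 load_half8(const void * p) {
    /* same pattern as GGML_F32Cx8_LOAD in ggml.c */
    return _mm256_cvtph_ps(_mm_loadu_si128((const __m128i *)p));
}

/* gcc -O2 -mavx        -c f16c_repro.c   -> inlining failed ... '_mm256_cvtph_ps'
   gcc -O2 -mavx -mf16c -c f16c_repro.c   -> compiles cleanly */

If the MinGW build enables AVX/AVX2 but not F16C, adding -mf16c is the kind of change that would make the error go away; whether that is the right fix for this build is a guess.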
If I don't include the patch, it compiles, but chat.exe gives me an illegal instruction
Try this version https://github.com/SiemensSchuckert/alpaca.cpp
Thanks, that works fine. Yaaaay!
Same problem.
@SiemensSchuckert that worked, thanks a lot :)