llama.cpp
Unhandled exception: _Xlength_error("string too long")
I used CMake to create the VC++ project and debugged it in VS2022.

python convert-pth-to-ggml.py models/7B/ 1 (done)
quantize.exe .\models\7B\ggml-model-f16.bin .\models\7B\ggml-model-q4_0.bin 2 (done)
llama -m .\models\7B\ggml-model-q4_0.bin -t 8 -n 128
main: seed = 1678771218
llama_model_load: loading model from '.\models\7B\ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: ggml ctx size = 4529.34 MB
llama_model_load: memory_size = 512.00 MB, n_mem = 16384
llama_model_load: loading model part 1/1 from '.\models\7B\ggml-model-q4_0.bin'
llama_model_load:
The Release build (llama.exe) runs fine.
When I use the Debug build, I get an "Unhandled exception" in xstring. The same happens with the 13B model.
Unhandled exception at 0x00007FF8E757051C in llama.exe: Microsoft C++ exception: std::length_error at memory location 0x0000006E15CFCA80.
The exception is raised in:
[[noreturn]] inline void _Xlen_string() {
_Xlength_error("string too long");
}
OS: win11 22H2 22621.1265
The models' MD5 checksums are OK.
The length passed in is -858993460, which is 0xCCCCCCCC, the pattern MSVC debug builds use to fill uninitialized memory.
An int32 can only hold values up to 2 GB, so those variables probably need to be int64_t, since the model files are over 4 GB.
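For illustration, here is a minimal sketch of how a 32-bit length field produces exactly this exception (hypothetical helpers, not the actual llama.cpp loader): a value past INT32_MAX wraps negative, and passing it to std::string::resize() converts it to an enormous size_t, so the MSVC STL throws std::length_error ("string too long"). Widening the length to a 64-bit type and validating it before resizing avoids the throw.

// Hypothetical sketch of the suspected failure mode -- not the real loader code (assumes C++17).
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>

std::string read_blob_32(std::ifstream &fin) {
    int32_t len = 0;                                   // 32-bit length field
    fin.read(reinterpret_cast<char *>(&len), sizeof(len));
    std::string buf;
    buf.resize(static_cast<size_t>(len));              // negative len -> huge size_t -> _Xlen_string()
    fin.read(buf.data(), len);
    return buf;
}

std::string read_blob_64(std::ifstream &fin, uint64_t max_len) {
    int64_t len = 0;                                   // 64-bit length field
    fin.read(reinterpret_cast<char *>(&len), sizeof(len));
    if (!fin || len < 0 || static_cast<uint64_t>(len) > max_len) {
        throw std::runtime_error("invalid length in model file");
    }
    std::string buf(static_cast<size_t>(len), '\0');
    fin.read(buf.data(), static_cast<std::streamsize>(len));
    return buf;
}

In the second variant a garbage value is rejected up front instead of ever reaching resize().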
@icewm: have you modified the model generation to match your change? Reading 8 bytes instead of 4 isn't going to magically work otherwise. I suggest you update to the latest master and re-generate your model files. Try to get sha256sum for Windows and check the input files, as per the README. If you still get the error, please post your generated model file sizes and checksums.
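For reference, Windows can compute a SHA-256 checksum without installing anything, using the built-in certutil tool, e.g.:

certutil -hashfile .\models\7B\ggml-model-q4_0.bin SHA256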
I found out that it's an AVX2 problem. When I use an E5-2660 v2 I get this error; switching to an E5-2660 v3 works fine.
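That would fit: the E5-2660 v2 is Ivy Bridge and has no AVX2, while the E5-2660 v3 is Haswell and does, so a binary built with AVX2 enabled can misbehave on the older CPU. As a rough sketch (simplified; a complete check would also verify OSXSAVE and the XGETBV-reported YMM state), AVX2 support can be probed at runtime with MSVC intrinsics:

#include <intrin.h>
#include <cstdio>

// Simplified AVX2 probe: CPUID leaf 7, sub-leaf 0, EBX bit 5.
static bool has_avx2() {
    int regs[4] = {0};
    __cpuid(regs, 0);
    if (regs[0] < 7) {
        return false;                    // CPUID leaf 7 not available
    }
    __cpuidex(regs, 7, 0);
    return (regs[1] & (1 << 5)) != 0;    // regs[1] is EBX; bit 5 = AVX2
}

int main() {
    std::printf("AVX2 supported: %s\n", has_avx2() ? "yes" : "no");
    return 0;
}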