private-gpt
Illegal hardware instruction
Hello,
Thanks again for this nice project! After the ingest process, when I try to run python3 privateGPT.py I get an error:
llama.cpp: loading model from ./models/ggml-model-q4_0.bin
llama.cpp: can't use mmap because tensors are not aligned; convert to new format to avoid this
llama_model_load_internal: format = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4113748.20 KB
llama_model_load_internal: mem required = 5809.33 MB (+ 2052.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size = 512.00 MB
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
Using embedded DuckDB with persistence: data will be stored in: db
[1] 18314 illegal hardware instruction python3 privateGPT.py
Hardware: MacBook Pro M1
Software: macOS Ventura
Regards, Hisxo
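A note on the log above: on Apple Silicon, an "illegal hardware instruction" crash often means the interpreter or a native extension was compiled for x86_64 and is running translated under Rosetta 2, and the feature line (SSE3 = 1 but NEON = 0 and ARM_FMA = 0 on an M1) points the same way. Below is a minimal diagnostic sketch, not part of this repo, that checks how the current Python process is actually running:

```python
# Minimal diagnostic sketch (not from this repo): check whether this Python
# process runs natively on Apple Silicon or translated under Rosetta 2.
# An x86_64 interpreter or an x86_64 llama-cpp wheel on an M1 is a common
# cause of "illegal hardware instruction" crashes.
import platform
import subprocess

print("machine:", platform.machine())  # 'arm64' = native, 'x86_64' = Rosetta

# sysctl.proc_translated is 1 when running under Rosetta 2, 0 when native;
# querying it fails on Intel Macs, where the key does not exist.
try:
    out = subprocess.run(
        ["sysctl", "-n", "sysctl.proc_translated"],
        capture_output=True, text=True, check=True,
    )
    print("translated under Rosetta:", out.stdout.strip() == "1")
except subprocess.CalledProcessError:
    print("sysctl.proc_translated not available (likely an Intel Mac)")
```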
Same for me, but on an M1 Pro.
% python3 privateGPT.py
llama.cpp: loading model from ./models/ggml-model-q4_0.bin
llama.cpp: can't use mmap because tensors are not aligned; convert to new format to avoid this
llama_model_load_internal: format = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4113748.20 KB
llama_model_load_internal: mem required = 5809.33 MB (+ 2052.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size = 512.00 MB
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
Using embedded DuckDB with persistence: data will be stored in: db
zsh: illegal hardware instruction  python3 privateGPT.py
I think you need AVX and/or F16C:
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
"No, M1 is not based on the x86 architecture so it can, in no way shape or form get AVX, because AVX is defined only for x86_64 architecture."
I think your CPU isn't supported.
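One way to test that theory is to check which architecture the installed llama-cpp-python native library was actually built for. The package layout and library names in this sketch are assumptions for illustration, not something documented in this issue:

```python
# Hedged sketch: list the native libraries shipped with the installed
# llama-cpp-python package and ask `file` for their architecture.
# On an M1, a "Mach-O 64-bit ... x86_64" result would explain why the
# log reports SSE3 = 1 but NEON = 0 and ARM_FMA = 0.
import glob
import os
import subprocess

import llama_cpp  # the Python bindings privateGPT uses under the hood

pkg_dir = os.path.dirname(llama_cpp.__file__)
libs = glob.glob(os.path.join(pkg_dir, "**", "*.so"), recursive=True)
libs += glob.glob(os.path.join(pkg_dir, "**", "*.dylib"), recursive=True)
for lib in libs:
    # `file` prints e.g. "Mach-O 64-bit bundle x86_64" or "... arm64"
    print(subprocess.run(["file", lib], capture_output=True, text=True).stdout)
```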
Try this fork, which uses Qdrant instead of Chroma. I think Chroma relies on DuckDB, which might be the issue. Using Qdrant I don't get the "Using embedded DuckDB" message. Let me know.
Hello @alxspiker
Same error with CASALIOY :(
Try these steps: https://gist.github.com/cedrickchee/e8d4cb0c4b1df6cc47ce8b18457ebde0
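If the crash does come from an x86_64 build, the usual remedy is to rebuild llama-cpp-python natively for arm64. The sketch below uses only standard pip flags plus the macOS arch tool; whether it matches the exact steps in the gist is an assumption:

```python
# Hedged sketch: force pip to rebuild llama-cpp-python from source as a
# native arm64 extension instead of reusing a cached x86_64 wheel.
# Assumes the Python binary itself is arm64-native (or a universal binary);
# `arch -arm64` fails with "Bad CPU type" otherwise.
import subprocess
import sys

subprocess.run(
    [
        "arch", "-arm64",            # run the build natively on Apple Silicon
        sys.executable, "-m", "pip",
        "install", "--force-reinstall", "--no-cache-dir",
        "llama-cpp-python",
    ],
    check=True,
)
```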
I am getting the same 'illegal hardware instruction' error on my M1 Pro after running privateGPT.py.
Hardware: MacBook Pro M1
Software: macOS Monterey
Python version: 3.10.11
@alxspiker The model provided in the README of this repo is in the same ggml quantized format as the one given in the link you provided. In that case, it should ideally work on an M1, right?
@alxspiker -- This needs to be built from source on M1, right? https://github.com/su77ungr/CASALIOY