lollms-webui
ERROR - failed to load model from ./models/gpt4all-lora-quantized-ggml.bin
Current Behavior
The default model file (gpt4all-lora-quantized-ggml.bin) already exists. Do you want to replace it? Press B to download it with a browser (faster). [Y,N,B]?N
Skipping download of model file...
Cleaning tmp folder
Virtual environment created and packages installed successfully.
Launching application...
Checking discussions database...
[2023-04-18 10:11:49,423] {model.py:73} INFO - Loading model ...
llama_model_load: loading model from './models/gpt4all-lora-quantized-ggml.bin' - please wait ...
llama_model_load: invalid model file './models/gpt4all-lora-quantized-ggml.bin' (bad magic)
[2023-04-18 10:11:49,424] {model.py:75} ERROR - failed to load model from ./models/gpt4all-lora-quantized-ggml.bin
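Before launching the UI, you can check whether a downloaded file is actually a ggml binary by inspecting its first four bytes. This is a minimal diagnostic sketch, not part of the webui itself; the magic values are assumed from the llama.cpp error messages in this thread (newer formats use different headers):

```python
import struct

# Magic values llama.cpp checked for at the time (assumption: inferred from
# the "bad magic" errors in this thread).
KNOWN_MAGICS = {
    0x67676D6C: "ggml (unversioned)",
    0x67676D66: "ggmf (versioned)",
    0x67676A74: "ggjt (mmap-able)",
}

def check_model_magic(path):
    """Read the first 4 bytes as a little-endian uint32 and name the format."""
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        return "file too short to be a model"
    (magic,) = struct.unpack("<I", header)
    return KNOWN_MAGICS.get(magic, f"bad magic 0x{magic:08x} - not a ggml file")
```

If this reports a bad magic, the download is usually truncated or not a model file at all, and re-downloading is the first thing to try.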
Steps to Reproduce
run webui.bat
Screenshots
What are your system specs? OS? CPU? RAM? Did you download the model correctly? Maybe it's corrupted; try downloading it with a browser, then copy it into the /models/ folder.
Win11, 9900K, 32G.
The file was downloaded completely without any issues. I tried downloading it again using the browser, but the file size and error message were the same.
The CPU seems to support AVX2.
I'm running this in an Ubuntu VM, and it seems to load just fine.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Checking discussions database...
llama_model_load: loading model from './models/gpt4all-lora-quantized-ggml.bin' - please wait ...
llama_model_load: n_vocab = 32001
llama_model_load: n_ctx = 512
llama_model_load: n_embd = 4096
llama_model_load: n_mult = 256
llama_model_load: n_head = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot = 128
llama_model_load: f16 = 2
llama_model_load: n_ff = 11008
llama_model_load: n_parts = 1
llama_model_load: type = 1
llama_model_load: ggml map size = 4017.70 MB
llama_model_load: ggml ctx size = 81.25 KB
llama_model_load: mem required = 5809.78 MB (+ 2052.00 MB per state)
llama_model_load: loading tensors from './models/gpt4all-lora-quantized-ggml.bin'
llama_model_load: model size = 4017.27 MB / num tensors = 291
llama_init_from_file: kv self size = 512.00 MB
Chatbot created successfully
* Serving Flask app 'GPT4All-WebUI'
* Debug mode: off
[2023-04-18 09:28:50,988] {_internal.py:224} INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on all addresses (0.0.0.0)
* Running on http://127.0.0.1:9600
* Running on http://mm.jj.ss:9600
[2023-04-18 09:28:50,988] {_internal.py:224} INFO - Press CTRL+C to quit
Well, I did a git pull just now on the VM, and I can't load it anymore either.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Checking discussions database...
llama_model_load: loading model from './models/gpt4all-lora-quantized-ggml.bin' - please wait ...
./models/gpt4all-lora-quantized-ggml.bin: invalid model file (bad magic [got 0x6e756f46 want 0x67676a74])
you most likely need to regenerate your ggml files
the benefit is you'll get 10-100x faster load times
see https://github.com/ggerganov/llama.cpp/issues/91
use convert-pth-to-ggml.py to regenerate from original pth
use migrate-ggml-2023-03-30-pr613.py if you deleted originals
llama_init_from_file: failed to load model
Chatbot created successfully
* Serving Flask app 'GPT4All-WebUI'
* Debug mode: off
[2023-04-18 09:35:03,952] {_internal.py:224} INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on all addresses (0.0.0.0)
* Running on http://127.0.0.1:9600
* Running on http://mm.jj.ss:9600
[2023-04-18 09:35:03,952] {_internal.py:224} INFO - Press CTRL+C to quit
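The two magic values in the error above decode to ASCII when unpacked the way llama.cpp reads them (as a little-endian uint32), which hints at what went wrong. A small sketch, just to illustrate the decoding:

```python
import struct

got, want = 0x6E756F46, 0x67676A74  # values from the "bad magic" error above

# llama.cpp reads the magic as a little-endian uint32, so packing the value
# back with "<I" recovers the literal first four bytes of the file.
print(struct.pack("<I", got))   # the first four bytes actually on disk
print(struct.pack("<I", want))  # the header of a valid ggjt model
```

The "got" value unpacks to `b'Foun'`, so the file on disk appears to begin with text ("Foun..."), which would be consistent with a download that saved an HTTP error or redirect page instead of the model binary.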
@ParisNeo something is borked again. Maybe pyllamacpp... my hatred of Python keeps growing...
You got "bad magic" too.
It's most likely that the model-loading script was updated. This UI relies on other packages/repos, so when they get updated it takes some time for this repo's main dev to go through the code and fix it. He's on vacation right now, so just hang tight; it will be fixed eventually.
@Datou Hi, try this model; it works. The original model is messed up, I don't know why.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Checking discussions database...
llama_model_load: loading model from './models/ggml-vicuna-13b-4bit-rev1.bin' - please wait ...
llama_model_load: n_vocab = 32001
llama_model_load: n_ctx = 512
llama_model_load: n_embd = 5120
llama_model_load: n_mult = 256
llama_model_load: n_head = 40
llama_model_load: n_layer = 40
llama_model_load: n_rot = 128
llama_model_load: f16 = 2
llama_model_load: n_ff = 13824
llama_model_load: n_parts = 2
llama_model_load: type = 2
llama_model_load: ggml map size = 7759.84 MB
llama_model_load: ggml ctx size = 101.25 KB
llama_model_load: mem required = 9807.93 MB (+ 3216.00 MB per state)
llama_model_load: loading tensors from './models/ggml-vicuna-13b-4bit-rev1.bin'
llama_model_load: model size = 7759.40 MB / num tensors = 363
llama_init_from_file: kv self size = 800.00 MB
Chatbot created successfully
* Serving Flask app 'GPT4All-WebUI'
* Debug mode: off
[2023-04-18 12:36:03,700] {_internal.py:224} INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on all addresses (0.0.0.0)
* Running on http://127.0.0.1:9600
* Running on http://mama-mia.juu:9600
[2023-04-18 12:36:03,700] {_internal.py:224} INFO - Press CTRL+C to quit
For me it loaded after I pulled the newest changes from git and re-downloaded the gpt4all-lora-quantized-ggml.bin model.
Sorry guys, I have a very slow connection these days and I lost it yesterday. It should work now. If the problem is solved, please make sure to close the issue.