
Executing server for chat / does not load

Open mironkraft opened this issue 1 year ago • 1 comment

Describe the bug

Bug caused during installation.

Is there an existing issue for this?

  • [X] I have searched the existing issues

Reproduction

After installing oobabooga on Linux and completing the last step: pip install requirements.txt --break-system-packages

We then run python3 server.py --chat and it crashes.
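
Note on the dependency step: without the -r flag, pip does not read requirements.txt as a list of dependencies but treats the argument as a package name, so the command was most likely intended as the following (same flags as above, paths assumed to be run from the text-generation-webui folder):

    # -r tells pip to read the file as a requirements list.
    pip install -r requirements.txt --break-system-packages

    # Then start the web UI in chat mode.
    python3 server.py --chat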

Screenshot

No response

Logs

root@mikel-VMware-Virtual-Platform:/home/mikel/oobabooga_linux/text-generation-webui# python3 server.py --chat
Gradio HTTP request redirected to localhost :)
bin /usr/local/lib/python3.11/dist-packages/bitsandbytes/libbitsandbytes_cpu.so
/usr/local/lib/python3.11/dist-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
Loading legacy-ggml-vicuna-13b-4bit...
llama.cpp weights detected: models/legacy-ggml-vicuna-13b-4bit/ggml-vicuna-13b-4bit-rev1.bin

llama.cpp: loading model from models/legacy-ggml-vicuna-13b-4bit/ggml-vicuna-13b-4bit-rev1.bin
error loading model: unknown (magic, version) combination: 73726576, 206e6f69; is this really a GGML file?
llama_init_from_file: failed to load model
╭───────────────────── Traceback (most recent call last) ─────────────────────╮
│ /home/mikel/oobabooga_linux/text-generation-webui/server.py:914 in <module> │
│                                                                             │
│   911 │   │   update_model_parameters(model_settings, initial=True)  # hija │
│   912 │   │                                                                 │
│   913 │   │   # Load the model                                              │
│ ❱ 914 │   │   shared.model, shared.tokenizer = load_model(shared.model_name │
│   915 │   │   if shared.args.lora:                                          │
│   916 │   │   │   add_lora_to_model(shared.args.lora)                       │
│   917                                                                       │
│                                                                             │
│ /home/mikel/oobabooga_linux/text-generation-webui/modules/models.py:141 in  │
│ load_model                                                                  │
│                                                                             │
│   138 │   │   │   model_file = list(Path(f'{shared.args.model_dir}/{model_n │
│   139 │   │                                                                 │
│   140 │   │   print(f"llama.cpp weights detected: {model_file}\n")          │
│ ❱ 141 │   │   model, tokenizer = LlamaCppModel.from_pretrained(model_file)  │
│   142 │   │   return model, tokenizer                                       │
│   143 │                                                                     │
│   144 │   # Quantized model                                                 │
│                                                                             │
│ /home/mikel/oobabooga_linux/text-generation-webui/modules/llamacpp_model.py │
│ :32 in from_pretrained                                                      │
│                                                                             │
│   29 │   │   │   'use_mmap': not shared.args.no_mmap,                       │
│   30 │   │   │   'use_mlock': shared.args.mlock                             │
│   31 │   │   }                                                              │
│ ❱ 32 │   │   self.model = Llama(**params)                                   │
│   33 │   │   self.model.set_cache(LlamaCache)                               │
│   34 │   │                                                                  │
│   35 │   │   # This is ugly, but the model and the tokenizer are the same o │
│                                                                             │
│ /usr/local/lib/python3.11/dist-packages/llama_cpp/llama.py:148 in __init__  │
│                                                                             │
│   145 │   │   │   self.model_path.encode("utf-8"), self.params              │
│   146 │   │   )                                                             │
│   147 │   │                                                                 │
│ ❱ 148 │   │   assert self.ctx is not None                                   │
│   149 │   │                                                                 │
│   150 │   │   if self.lora_path:                                            │
│   151 │   │   │   if llama_cpp.llama_apply_lora_from_file(                  │
╰─────────────────────────────────────────────────────────────────────────────╯
AssertionError

System Info

Ubuntu 22 (Linux) running on VMware Workstation 15

mironkraft avatar May 04 '23 08:05 mironkraft

It appears that I am facing an issue with loading the model. The error message indicates a problem with the model file format, which might be due to an incorrect or corrupted file. I will take the following steps:

  1. Verify the model file: make sure you have downloaded the correct model file and that it is not corrupted. You might want to re-download the model file and replace the existing one (a quick check is sketched right after this list).
  2. Check the model file format: ensure the model file is in the correct format. In this case, it should be a GGML file. If you have accidentally downloaded a different format, you need to download the correct one.
  3. Update the model loading code: if you have made any changes to the model loading code or the model file path, double-check your changes and ensure the code points to the correct model file.
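
As a quick check, assuming the .bin on disk is actually a Git LFS pointer file rather than the real weights: the two magic values in the log above, 73726576 and 206e6f69, are the little-endian ASCII bytes for "vers" and "ion ", which is exactly how an LFS pointer begins ("version https://git-lfs.github.com/spec/v1"). A minimal sketch using the path from the log:

    # A real GGML model starts with a binary magic; a Git LFS pointer is
    # plain text starting with "version https://git-lfs.github.com/spec/v1".
    head -c 120 models/legacy-ggml-vicuna-13b-4bit/ggml-vicuna-13b-4bit-rev1.bin

    # A pointer file is only ~130 bytes; the real weights are several GB.
    ls -lh models/legacy-ggml-vicuna-13b-4bit/ggml-vicuna-13b-4bit-rev1.bin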

It seems that I have to re-download the files from the model repository shown in the attached screenshot.

Once done, I will move the files to the folder on Linux where the repository was cloned. Remember that the model weights are stored with Git Large File Storage (LFS); for more details see: https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage
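
A minimal sketch of fetching the weights through Git LFS, assuming the model is hosted in a Git repository with LFS-tracked files; <model-repo-url> is a placeholder, not a URL taken from this issue:

    # One-time setup of the Git LFS extension (see the link above).
    git lfs install

    # Clone the model repository; with LFS installed, the large .bin files
    # are downloaded instead of the small pointer files.
    git clone <model-repo-url> models/legacy-ggml-vicuna-13b-4bit

    # If only pointer files came down, fetch the real blobs afterwards.
    cd models/legacy-ggml-vicuna-13b-4bit && git lfs pull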

Right now it is downloading the LFS files directly from the repository (screenshot attached).

I will keep this issue updated until I solve the problem.

mironkraft avatar May 04 '23 12:05 mironkraft

This bug was solved by following these steps (see the sketch after this list):

  1. Downloading both files directly in the browser (not through Git LFS)
  2. Copying and pasting them into the model folder (using the cp command)
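
A minimal sketch of that copy step, assuming the files were downloaded to ~/Downloads and the destination is the model directory from the log above (both paths are assumptions, adjust as needed):

    # Copy the browser-downloaded weights into the directory that
    # text-generation-webui scans for models.
    cp ~/Downloads/ggml-vicuna-13b-4bit-rev1.bin models/legacy-ggml-vicuna-13b-4bit/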

Closing this.

mironkraft avatar May 05 '23 10:05 mironkraft