
Mistake in the preparation of Vicuna weights (error when loading delta weights)

Open · huangzhongzhong opened this issue Apr 19 '23 · 7 comments

I ran the script to get the Vicuna weights and got the following error:

python -m fastchat.model.apply_delta --base I:\chatgpt\minigpt4\MiniGPT-4\llama-13b-hf --target I:\chatgpt\minigpt4\MiniGPT-4\model --delta I:\chatgpt\minigpt4\MiniGPT-4\vicuna-13b-delta-v0
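For context, a rough sketch of what that script does: Vicuna v0 is distributed as a delta over LLaMA, so each delta tensor is added onto the matching base tensor and the sum is saved as the target model. The sketch below is illustrative only, not the official FastChat code, and the paths are placeholders:

```python
# Conceptual sketch of delta merging (illustrative, not the official script).
# "llama-13b-hf", "vicuna-13b-delta-v0", and "model" are placeholder paths.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "llama-13b-hf", torch_dtype=torch.float16, low_cpu_mem_usage=True)
delta = AutoModelForCausalLM.from_pretrained(
    "vicuna-13b-delta-v0", torch_dtype=torch.float16, low_cpu_mem_usage=True)

delta_state = delta.state_dict()
for name, param in base.state_dict().items():
    param.data += delta_state[name]   # base + delta = Vicuna weight

base.save_pretrained("model")         # write the merged checkpoint
```

Note that this holds two 13B fp16 models in RAM at once, roughly 26 GB each, which matters for the memory spike discussed further down.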


huangzhongzhong · Apr 19 '23 12:04

'model.layers.18.mlp.gate_proj.weight', 'model.layers.13.mlp.down_proj.weight', 'model.layers.18.self_attn.q_proj.weight', 'model.layers.39.self_attn.o_proj.weight', 'model.layers.17.mlp.up_proj.weight', 'model.layers.24.self_attn.q_proj.weight', 'model.layers.2.post_attention_layernorm.weight', 'model.layers.17.mlp.down_proj.weight', 'model.layers.27.mlp.down_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Traceback (most recent call last):
  File "C:\Users\huang\.conda\envs\minigpt4\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None, "__main__", mod_spec)
  File "C:\Users\huang\.conda\envs\minigpt4\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\huang\.conda\envs\minigpt4\lib\site-packages\fastchat\model\apply_delta.py", line 153, in <module>
    apply_delta(args.base_model_path, args.target_model_path, args.delta_path)
  File "C:\Users\huang\.conda\envs\minigpt4\lib\site-packages\fastchat\model\apply_delta.py", line 124, in apply_delta
    base_tokenizer = AutoTokenizer.from_pretrained(base_model_path, use_fast=False)
  File "C:\Users\huang\.conda\envs\minigpt4\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 689, in from_pretrained
    raise ValueError(f"Tokenizer class {tokenizer_class_candidate} does not exist or is not currently imported.")
ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.
(minigpt4) PS I:\chatgpt\minigpt4\MiniGPT-4>

huangzhongzhong · Apr 19 '23 12:04

Change "tokenizer_class": "LLaMATokenizer" in llama-13b-hf/tokenizer_config.json into "tokenizer_class": "LlamaTokenizer". It worked for me~

gch8295322 · Apr 19 '23 13:04

I have observed that a few seconds before the error occurred, the memory usage suddenly spiked to 60GB out of my total 64GB of memory. I suspect this issue might be related to memory consumption. Could you please provide some guidance or suggestions on how to handle this situation? Thank you in advance. @gch8295322

huangzhongzhong · Apr 19 '23 15:04

I have seen this issue discussed here before; see if that can help you?
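Judging from the apply_delta.py excerpt in the traceback above, the script also has a low-memory code path (apply_delta_low_cpu_mem) gated on an args.low_cpu_mem flag. Assuming the installed FastChat version exposes it as --low-cpu-mem, the merge can be retried with that flag appended to keep peak RAM down:

python -m fastchat.model.apply_delta --base I:\chatgpt\minigpt4\MiniGPT-4\llama-13b-hf --target I:\chatgpt\minigpt4\MiniGPT-4\model --delta I:\chatgpt\minigpt4\MiniGPT-4\vicuna-13b-delta-v0 --low-cpu-mem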

gch8295322 · Apr 19 '23 15:04

Dear @gch8295322

Thank you for your help earlier. I have prepared the model, but I am still encountering the "TypeError: argument of type 'WindowsPath' is not iterable" issue. I noticed this problem is also being discussed in https://github.com/Vision-CAIR/MiniGPT-4/issues/28. Is there a solution to this issue at the moment? Once again, thank you for your assistance.

Best regards

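For reference, that TypeError almost always means a pathlib.Path object reached code that performs a substring membership test, which only strings support. A tiny self-contained reproduction (not MiniGPT-4 code, just the error class):

```python
# Minimal reproduction: "in" on a WindowsPath raises because Path objects
# are not iterable; converting to str first makes the check valid.
from pathlib import Path

p = Path(r"I:\chatgpt\minigpt4\MiniGPT-4\model")
try:
    ".bin" in p           # TypeError: argument of type 'WindowsPath' is not iterable
except TypeError as e:
    print(e)
print(".bin" in str(p))   # works once the path is a plain string
```

So wherever the model path is configured, making sure it ends up as a plain string is a plausible workaround to try alongside the suggestions in issue #28.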

huangzhongzhong · Apr 19 '23 17:04

Change "tokenizer_class": "LLaMATokenizer" in llama-13b-hf/tokenizer_config.json into "tokenizer_class": "LlamaTokenizer". It worked for me~

I still hit this issue, and I checked that it is already "Llama" (in the Windows terminal).

Wenbobobo · Apr 20 '23 03:04

@Wenbobobo that's weird, I changed the class name and it works. Maybe you should just reboot the terminal?
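If a terminal restart doesn't help, one more hedged guess worth ruling out: a second tokenizer_config.json elsewhere in the tree (for instance under the delta directory) that still carries the old spelling. A quick scan, assuming the checkout lives at the path below:

```python
# Debugging sketch: list every tokenizer_config.json under the working tree
# and print which tokenizer class each one declares.
import json
from pathlib import Path

root = Path(r"I:\chatgpt\minigpt4\MiniGPT-4")   # adjust to your checkout
for cfg in root.rglob("tokenizer_config.json"):
    cls = json.loads(cfg.read_text()).get("tokenizer_class")
    print(f"{cfg} -> {cls}")
```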

LikeGiver · Apr 22 '23 12:04