
An error is reported when the model is loaded

Open Athicbliss opened this issue 9 months ago • 4 comments

Some weights of LlavaLlamaForCausalLM were not initialized from the model checkpoint at /data/workspace/models/InternVL-Chat-V1-5 and are newly initialized:
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
AttributeError: 'NoneType' object has no attribute 'is_loaded'

Athicbliss avatar May 07 '24 02:05 Athicbliss

Hello, thank you for your interest. Could you provide more information, such as the command used to load the model?

czczup avatar May 08 '24 16:05 czczup

@czczup

I encountered the same issue.

I downloaded model file from https://modelscope.cn/models/AI-ModelScope/InternVL-Chat-V1-5/files.

I tried to follow the instructions at https://github.com/OpenGVLab/InternVL/blob/main/document/how_to_deploy_a_local_demo.md to run the Gradio demo.

I executed the following command:

python3.10 -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --device auto --worker http://localhost:40000 --model-path /root/onethingai-tmp/models/InternVL-Chat-V1-5

The console output is as follows:

Discovered apex.normalization.FusedRMSNorm - will use it instead of LlamaRMSNorm
2024-05-18 23:42:55 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=40000, worker_address='http://localhost:40000', controller_address='http://localhost:10000', model_path='/root/onethingai-tmp/models/InternVL-Chat-V1-5', model_base=None, model_name=None, device='auto', multi_modal=False, limit_model_concurrency=5, stream_interval=1, no_register=False, load_8bit=False, load_4bit=False)
2024-05-18 23:42:55 | INFO | model_worker | Loading the model InternVL-Chat-V1-5 on worker 8b8837 ...
The repository for /root/onethingai-tmp/models/InternVL-Chat-V1-5 contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//root/onethingai-tmp/models/InternVL-Chat-V1-5. You can avoid this prompt in future by passing the argument trust_remote_code=True.

Do you wish to run the custom code? [y/N] y
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
You are using a model of type internvl_chat to instantiate a model of type llava_llama. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████| 11/11 [00:00<00:00, 11.51it/s]
2024-05-18 23:49:32 | ERROR | stderr | Some weights of LlavaLlamaForCausalLM were not initialized from the model checkpoint at /root/onethingai-tmp/models/InternVL-Chat-V1-5 and are newly initialized: ['embed_tokens.weight', 'layers.0.input_layernorm.weight', 'layers.0.mlp.down_proj.weight', 'layers.0.mlp.gate_proj.weight', 'layers.0.mlp.up_proj.weight', 'layers.0.post_attention_layernorm.weight', 'layers.0.self_attn.k_proj.weight', 'layers.0.self_attn.o_proj.weight', 'layers.0.self_attn.q_proj.weight', 'layers.0.self_attn.v_proj.weight', 'layers.1.input_layernorm.weight', 'layers.1.mlp.down_proj.weight', 'layers.1.mlp.gate_proj.weight', 'layers.1.mlp.up_proj.weight', 'layers.1.post_attention_layernorm.weight', 'layers.1.self_attn.k_proj.weight', 'layers.1.self_attn.o_proj.weight', 'layers.1.self_attn.q_proj.weight', 'layers.1.self_attn.v_proj.weight', 'layers.10.input_layernorm.weight', 'layers.10.mlp.down_proj.weight', 'layers.10.mlp.gate_proj.weight', 'layers.10.mlp.up_proj.weight', 'layers.10.post_attention_layernorm.weight', 'layers.10.self_attn.k_proj.weight', 'layers.10.self_attn.o_proj.weight', 'layers.10.self_attn.q_proj.weight', 'layers.10.self_attn.v_proj.weight', 'layers.11.input_layernorm.weight', 'layers.11.mlp.down_proj.weight', 'layers.11.mlp.gate_proj.weight', 'layers.11.mlp.up_proj.weight', 'layers.11.post_attention_layernorm.weight', 'layers.11.self_attn.k_proj.weight', 'layers.11.self_attn.o_proj.weight', 'layers.11.self_attn.q_proj.weight', 'layers.11.self_attn.v_proj.weight', 'layers.12.input_layernorm.weight', 'layers.12.mlp.down_proj.weight', 'layers.12.mlp.gate_proj.weight', 'layers.12.mlp.up_proj.weight', 'layers.12.post_attention_layernorm.weight', 'layers.12.self_attn.k_proj.weight', 'layers.12.self_attn.o_proj.weight', 'layers.12.self_attn.q_proj.weight', 'layers.12.self_attn.v_proj.weight', 'layers.13.input_layernorm.weight', 'layers.13.mlp.down_proj.weight', 'layers.13.mlp.gate_proj.weight', 'layers.13.mlp.up_proj.weight', 'layers.13.post_attention_layernorm.weight', 'layers.13.self_attn.k_proj.weight', 'layers.13.self_attn.o_proj.weight', 'layers.13.self_attn.q_proj.weight', 'layers.13.self_attn.v_proj.weight', 'layers.14.input_layernorm.weight', 'layers.14.mlp.down_proj.weight', 'layers.14.mlp.gate_proj.weight', 'layers.14.mlp.up_proj.weight', 'layers.14.post_attention_layernorm.weight',
'layers.14.self_attn.k_proj.weight', 'layers.14.self_attn.o_proj.weight', 'layers.14.self_attn.q_proj.weight', 'layers.14.self_attn.v_proj.weight', 'layers.15.input_layernorm.weight', 'layers.15.mlp.down_proj.weight', 'layers.15.mlp.gate_proj.weight', 'layers.15.mlp.up_proj.weight', 'layers.15.post_attention_layernorm.weight', 'layers.15.self_attn.k_proj.weight', 'layers.15.self_attn.o_proj.weight', 'layers.15.self_attn.q_proj.weight', 'layers.15.self_attn.v_proj.weight', 'layers.16.input_layernorm.weight', 'layers.16.mlp.down_proj.weight', 'layers.16.mlp.gate_proj.weight', 'layers.16.mlp.up_proj.weight', 'layers.16.post_attention_layernorm.weight', 'layers.16.self_attn.k_proj.weight', 'layers.16.self_attn.o_proj.weight', 'layers.16.self_attn.q_proj.weight', 'layers.16.self_attn.v_proj.weight', 'layers.17.input_layernorm.weight', 'layers.17.mlp.down_proj.weight', 'layers.17.mlp.gate_proj.weight', 'layers.17.mlp.up_proj.weight', 'layers.17.post_attention_layernorm.weight', 'layers.17.self_attn.k_proj.weight', 'layers.17.self_attn.o_proj.weight', 'layers.17.self_attn.q_proj.weight', 'layers.17.self_attn.v_proj.weight', 'layers.18.input_layernorm.weight', 'layers.18.mlp.down_proj.weight', 'layers.18.mlp.gate_proj.weight', 'layers.18.mlp.up_proj.weight', 'layers.18.post_attention_layernorm.weight', 'layers.18.self_attn.k_proj.weight', 'layers.18.self_attn.o_proj.weight', 'layers.18.self_attn.q_proj.weight', 'layers.18.self_attn.v_proj.weight', 'layers.19.input_layernorm.weight', 'layers.19.mlp.down_proj.weight', 'layers.19.mlp.gate_proj.weight', 'layers.19.mlp.up_proj.weight', 'layers.19.post_attention_layernorm.weight', 'layers.19.self_attn.k_proj.weight', 'layers.19.self_attn.o_proj.weight', 'layers.19.self_attn.q_proj.weight', 'layers.19.self_attn.v_proj.weight', 'layers.2.input_layernorm.weight', 'layers.2.mlp.down_proj.weight', 'layers.2.mlp.gate_proj.weight', 'layers.2.mlp.up_proj.weight', 'layers.2.post_attention_layernorm.weight', 'layers.2.self_attn.k_proj.weight', 'layers.2.self_attn.o_proj.weight', 'layers.2.self_attn.q_proj.weight', 'layers.2.self_attn.v_proj.weight', 'layers.20.input_layernorm.weight', 'layers.20.mlp.down_proj.weight', 'layers.20.mlp.gate_proj.weight', 'layers.20.mlp.up_proj.weight', 'layers.20.post_attention_layernorm.weight', 'layers.20.self_attn.k_proj.weight', 'layers.20.self_attn.o_proj.weight', 'layers.20.self_attn.q_proj.weight', 'layers.20.self_attn.v_proj.weight', 'layers.21.input_layernorm.weight', 'layers.21.mlp.down_proj.weight', 'layers.21.mlp.gate_proj.weight', 'layers.21.mlp.up_proj.weight', 'layers.21.post_attention_layernorm.weight', 'layers.21.self_attn.k_proj.weight', 'layers.21.self_attn.o_proj.weight', 'layers.21.self_attn.q_proj.weight', 'layers.21.self_attn.v_proj.weight', 'layers.22.input_layernorm.weight', 'layers.22.mlp.down_proj.weight', 'layers.22.mlp.gate_proj.weight', 'layers.22.mlp.up_proj.weight', 'layers.22.post_attention_layernorm.weight', 'layers.22.self_attn.k_proj.weight', 'layers.22.self_attn.o_proj.weight', 'layers.22.self_attn.q_proj.weight', 'layers.22.self_attn.v_proj.weight', 'layers.23.input_layernorm.weight', 'layers.23.mlp.down_proj.weight', 'layers.23.mlp.gate_proj.weight', 'layers.23.mlp.up_proj.weight', 'layers.23.post_attention_layernorm.weight', 'layers.23.self_attn.k_proj.weight', 'layers.23.self_attn.o_proj.weight', 'layers.23.self_attn.q_proj.weight', 'layers.23.self_attn.v_proj.weight', 'layers.24.input_layernorm.weight', 'layers.24.mlp.down_proj.weight', 'layers.24.mlp.gate_proj.weight', 
'layers.24.mlp.up_proj.weight', 'layers.24.post_attention_layernorm.weight', 'layers.24.self_attn.k_proj.weight', 'layers.24.self_attn.o_proj.weight', 'layers.24.self_attn.q_proj.weight', 'layers.24.self_attn.v_proj.weight', 'layers.25.input_layernorm.weight', 'layers.25.mlp.down_proj.weight', 'layers.25.mlp.gate_proj.weight', 'layers.25.mlp.up_proj.weight', 'layers.25.post_attention_layernorm.weight', 'layers.25.self_attn.k_proj.weight', 'layers.25.self_attn.o_proj.weight', 'layers.25.self_attn.q_proj.weight', 'layers.25.self_attn.v_proj.weight', 'layers.26.input_layernorm.weight', 'layers.26.mlp.down_proj.weight', 'layers.26.mlp.gate_proj.weight', 'layers.26.mlp.up_proj.weight', 'layers.26.post_attention_layernorm.weight', 'layers.26.self_attn.k_proj.weight', 'layers.26.self_attn.o_proj.weight', 'layers.26.self_attn.q_proj.weight', 'layers.26.self_attn.v_proj.weight', 'layers.27.input_layernorm.weight', 'layers.27.mlp.down_proj.weight', 'layers.27.mlp.gate_proj.weight', 'layers.27.mlp.up_proj.weight', 'layers.27.post_attention_layernorm.weight', 'layers.27.self_attn.k_proj.weight', 'layers.27.self_attn.o_proj.weight', 'layers.27.self_attn.q_proj.weight', 'layers.27.self_attn.v_proj.weight', 'layers.28.input_layernorm.weight', 'layers.28.mlp.down_proj.weight', 'layers.28.mlp.gate_proj.weight', 'layers.28.mlp.up_proj.weight', 'layers.28.post_attention_layernorm.weight', 'layers.28.self_attn.k_proj.weight', 'layers.28.self_attn.o_proj.weight', 'layers.28.self_attn.q_proj.weight', 'layers.28.self_attn.v_proj.weight', 'layers.29.input_layernorm.weight', 'layers.29.mlp.down_proj.weight', 'layers.29.mlp.gate_proj.weight', 'layers.29.mlp.up_proj.weight', 'layers.29.post_attention_layernorm.weight', 'layers.29.self_attn.k_proj.weight', 'layers.29.self_attn.o_proj.weight', 'layers.29.self_attn.q_proj.weight', 'layers.29.self_attn.v_proj.weight', 'layers.3.input_layernorm.weight', 'layers.3.mlp.down_proj.weight', 'layers.3.mlp.gate_proj.weight', 'layers.3.mlp.up_proj.weight', 'layers.3.post_attention_layernorm.weight', 'layers.3.self_attn.k_proj.weight', 'layers.3.self_attn.o_proj.weight', 'layers.3.self_attn.q_proj.weight', 'layers.3.self_attn.v_proj.weight', 'layers.30.input_layernorm.weight', 'layers.30.mlp.down_proj.weight', 'layers.30.mlp.gate_proj.weight', 'layers.30.mlp.up_proj.weight', 'layers.30.post_attention_layernorm.weight', 'layers.30.self_attn.k_proj.weight', 'layers.30.self_attn.o_proj.weight', 'layers.30.self_attn.q_proj.weight', 'layers.30.self_attn.v_proj.weight', 'layers.31.input_layernorm.weight', 'layers.31.mlp.down_proj.weight', 'layers.31.mlp.gate_proj.weight', 'layers.31.mlp.up_proj.weight', 'layers.31.post_attention_layernorm.weight', 'layers.31.self_attn.k_proj.weight', 'layers.31.self_attn.o_proj.weight', 'layers.31.self_attn.q_proj.weight', 'layers.31.self_attn.v_proj.weight', 'layers.4.input_layernorm.weight', 'layers.4.mlp.down_proj.weight', 'layers.4.mlp.gate_proj.weight', 'layers.4.mlp.up_proj.weight', 'layers.4.post_attention_layernorm.weight', 'layers.4.self_attn.k_proj.weight', 'layers.4.self_attn.o_proj.weight', 'layers.4.self_attn.q_proj.weight', 'layers.4.self_attn.v_proj.weight', 'layers.5.input_layernorm.weight', 'layers.5.mlp.down_proj.weight', 'layers.5.mlp.gate_proj.weight', 'layers.5.mlp.up_proj.weight', 'layers.5.post_attention_layernorm.weight', 'layers.5.self_attn.k_proj.weight', 'layers.5.self_attn.o_proj.weight', 'layers.5.self_attn.q_proj.weight', 'layers.5.self_attn.v_proj.weight', 'layers.6.input_layernorm.weight', 
'layers.6.mlp.down_proj.weight', 'layers.6.mlp.gate_proj.weight', 'layers.6.mlp.up_proj.weight', 'layers.6.post_attention_layernorm.weight', 'layers.6.self_attn.k_proj.weight', 'layers.6.self_attn.o_proj.weight', 'layers.6.self_attn.q_proj.weight', 'layers.6.self_attn.v_proj.weight', 'layers.7.input_layernorm.weight', 'layers.7.mlp.down_proj.weight', 'layers.7.mlp.gate_proj.weight', 'layers.7.mlp.up_proj.weight', 'layers.7.post_attention_layernorm.weight', 'layers.7.self_attn.k_proj.weight', 'layers.7.self_attn.o_proj.weight', 'layers.7.self_attn.q_proj.weight', 'layers.7.self_attn.v_proj.weight', 'layers.8.input_layernorm.weight', 'layers.8.mlp.down_proj.weight', 'layers.8.mlp.gate_proj.weight', 'layers.8.mlp.up_proj.weight', 'layers.8.post_attention_layernorm.weight', 'layers.8.self_attn.k_proj.weight', 'layers.8.self_attn.o_proj.weight', 'layers.8.self_attn.q_proj.weight', 'layers.8.self_attn.v_proj.weight', 'layers.9.input_layernorm.weight', 'layers.9.mlp.down_proj.weight', 'layers.9.mlp.gate_proj.weight', 'layers.9.mlp.up_proj.weight', 'layers.9.post_attention_layernorm.weight', 'layers.9.self_attn.k_proj.weight', 'layers.9.self_attn.o_proj.weight', 'layers.9.self_attn.q_proj.weight', 'layers.9.self_attn.v_proj.weight', 'lm_head.weight', 'norm.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2024-05-18 23:49:45 | ERROR | stderr | Traceback (most recent call last):
2024-05-18 23:49:45 | ERROR | stderr |   File "/root/onethingai-tmp/anaconda3/envs/internvl/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2024-05-18 23:49:45 | ERROR | stderr |     return _run_code(code, main_globals, None,
2024-05-18 23:49:45 | ERROR | stderr |   File "/root/onethingai-tmp/anaconda3/envs/internvl/lib/python3.10/runpy.py", line 86, in _run_code
2024-05-18 23:49:45 | ERROR | stderr |     exec(code, run_globals)
2024-05-18 23:49:45 | ERROR | stderr |   File "/root/onethingai-tmp/workspace/InternVL/internvl_chat_llava/llava/serve/model_worker.py", line 275, in <module>
2024-05-18 23:49:45 | ERROR | stderr |     worker = ModelWorker(args.controller_address,
2024-05-18 23:49:45 | ERROR | stderr |   File "/root/onethingai-tmp/workspace/InternVL/internvl_chat_llava/llava/serve/model_worker.py", line 65, in __init__
2024-05-18 23:49:45 | ERROR | stderr |     self.tokenizer, self.model, self.image_processor, self.context_len = load_pretrained_model(
2024-05-18 23:49:45 | ERROR | stderr |   File "/root/onethingai-tmp/workspace/InternVL/internvl_chat_llava/llava/model/builder.py", line 138, in load_pretrained_model
2024-05-18 23:49:45 | ERROR | stderr |     if not vision_tower.is_loaded:
2024-05-18 23:49:45 | ERROR | stderr | AttributeError: 'NoneType' object has no attribute 'is_loaded'
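For what it's worth, the "model of type internvl_chat to instantiate a model of type llava_llama" warning comes straight from the checkpoint's config.json, which can be checked directly. A minimal sketch (the expected values are inferred from the warning above, not verified against the files):

import json
import os

# Path to the locally downloaded checkpoint.
model_path = "/root/onethingai-tmp/models/InternVL-Chat-V1-5"

# transformers compares the model_type declared by the checkpoint against
# the model_type of the class doing the loading; a mismatch produces the
# warning above and leaves every unmatched weight newly initialized.
with open(os.path.join(model_path, "config.json")) as f:
    cfg = json.load(f)

print(cfg.get("model_type"))     # expected: "internvl_chat", not "llava_llama"
print(cfg.get("architectures"))  # presumably ["InternVLChatModel"]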

Could you please help with resolving this issue?

Thanks in advance!

P.S., the output of pip list is as follows:


Package Version
------- -------
accelerate 0.30.1
addict 2.4.0
aiofiles 23.2.1
aliyun-python-sdk-core 2.15.1
aliyun-python-sdk-kms 2.16.3
altair 5.3.0
annotated-types 0.6.0
anyio 4.3.0
apex 0.1
attrs 23.2.0
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
cmake 3.29.3
colorama 0.4.6
contourpy 1.2.1
crcmod 1.7
cryptography 42.0.7
cycler 0.12.1
deepspeed 0.13.5
dnspython 2.6.1
einops 0.8.0
email_validator 2.1.1
exceptiongroup 1.2.1
fastapi 0.111.0
fastapi-cli 0.0.3
ffmpy 0.3.2
filelock 3.14.0
fire 0.6.0
flash-attn 2.3.6
fonttools 4.51.0
fsspec 2024.5.0
gradio 3.50.2
gradio_client 0.6.1
h11 0.14.0
hjson 3.1.0
httpcore 1.0.5
httptools 0.6.1
httpx 0.27.0
huggingface-hub 0.23.0
idna 3.7
importlib_metadata 7.1.0
importlib_resources 6.4.0
Jinja2 3.1.4
jmespath 0.10.0
jsonschema 4.22.0
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lit 18.1.4
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.9.0
mdurl 0.1.2
mmcv-full 1.6.2
mmengine-lite 0.10.4
model-index 0.1.11
mpmath 1.3.0
networkx 3.3
ninja 1.11.1.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.1.105
opencv-python 4.9.0.80
opendatalab 0.0.10
openmim 0.3.9
openxlab 0.0.26
ordered-set 4.1.0
orjson 3.10.3
oss2 2.17.0
packaging 24.0
pandas 2.2.2
peft 0.9.0
pillow 10.3.0
pip 24.0
platformdirs 4.2.2
protobuf 5.26.1
psutil 5.9.8
py-cpuinfo 9.0.0
pycocoevalcap 1.2
pycocotools 2.0.7
pycparser 2.22
pycryptodome 3.20.0
pydantic 2.7.1
pydantic_core 2.18.2
pydub 0.25.1
Pygments 2.18.0
pynvml 11.5.0
pyparsing 3.1.2
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-multipart 0.0.9
pytz 2023.4
PyYAML 6.0.1
referencing 0.35.1
regex 2024.5.15
requests 2.28.2
rich 13.4.2
rpds-py 0.18.1
ruff 0.4.4
safetensors 0.4.3
scipy 1.13.0
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 60.2.0
shellingham 1.5.4
shortuuid 1.0.13
six 1.16.0
sniffio 1.3.1
starlette 0.37.2
sympy 1.12
tabulate 0.9.0
termcolor 2.4.0
tiktoken 0.7.0
timm 0.9.12
tokenizers 0.15.2
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.0.1+cu118
torchaudio 2.0.2+cu118
torchvision 0.15.2+cu118
tqdm 4.65.2
transformers 4.37.2
triton 2.0.0
typer 0.12.3
typing_extensions 4.11.0
tzdata 2024.1
ujson 5.10.0
urllib3 1.26.18
uvicorn 0.29.0
uvloop 0.19.0
watchfiles 0.21.0
websockets 11.0.3
wheel 0.43.0
yacs 0.1.8
yapf 0.40.2
zipp 3.18.2

mobguang avatar May 18 '24 15:05 mobguang

Oh, I figured it out. You should go to the internvl_chat folder and execute the following command:

python3.10 -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000/ --port 40000 --device auto --worker http://localhost:40000/ --model-path /root/onethingai-tmp/models/InternVL-Chat-V1-5

You should use internvl.serve.model_worker instead of llava.serve.model_worker.
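To sanity-check the checkpoint outside the demo server, you can also load it directly with transformers. A minimal sketch (device placement is omitted; a 26B model will need multiple GPUs or quantization in practice):

import torch
from transformers import AutoModel, AutoTokenizer

path = "/root/onethingai-tmp/models/InternVL-Chat-V1-5"

# trust_remote_code=True runs the InternVLChatModel code shipped with the
# checkpoint, so the weights land on the right class instead of being
# force-fitted into LlavaLlamaForCausalLM.
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).eval()

print(type(model).__name__)  # expected: InternVLChatModel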

czczup avatar May 19 '24 11:05 czczup

@czczup

Yes, I missed that critical information in the documentation. It worked after I changed the folder path and used "internvl.serve.model_worker".

Thank you very much!

P.S., is there any plan to support Apple M-series chips? They are a very cost-effective choice for LLM application development. Thanks.

mobguang avatar May 19 '24 15:05 mobguang

Yes, I have considered Mac, but I've been quite busy with work lately and haven't had the time to try it out. On the other hand, my Mac only has 16GB of memory, which might not be sufficient to debug this 26B model.

But we recently released smaller models with 2B and 4B parameters, which I might be able to deploy on my device.
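For anyone who wants to experiment on Apple Silicon in the meantime, a rough sketch of loading one of the smaller checkpoints on the MPS backend. The repo id is an assumption, and whether the model's custom code runs without CUDA/flash-attn on MPS is unverified, so treat this as a starting point rather than a supported path:

import torch
from transformers import AutoModel, AutoTokenizer

# Assumed repo id for the 2B release; substitute the actual one.
path = "OpenGVLab/Mini-InternVL-Chat-2B-V1-5"

# Fall back to CPU if the Metal backend is unavailable.
device = "mps" if torch.backends.mps.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.float16,  # float16: bfloat16 support on MPS is spotty
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to(device).eval()

print(f"loaded {type(model).__name__} on {device}")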

czczup avatar May 30 '24 14:05 czczup

Great!

mobguang avatar May 30 '24 15:05 mobguang