InternVL
An error is reported when the model is loaded
Some weights of LlavaLlamaForCausalLM were not initialized from the model checkpoint at /data/workspace/models/InternVL-Chat-V1-5 and are newly initialized: You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. AttributeError: 'NoneType' object has no attribute 'is_loaded'
Hello, thank you for your interest. Could you provide more information, such as the command you used to load the model?
@czczup
I encountered the same issue.
I downloaded the model files from https://modelscope.cn/models/AI-ModelScope/InternVL-Chat-V1-5/files.
I followed the instructions at https://github.com/OpenGVLab/InternVL/blob/main/document/how_to_deploy_a_local_demo.md to run the Gradio demo.
I executed the following command: python3.10 -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --device auto --worker http://localhost:40000 --model-path /root/onethingai-tmp/models/InternVL-Chat-V1-5
The console output was as follows:
Discovered apex.normalization.FusedRMSNorm - will use it instead of LlamaRMSNorm
2024-05-18 23:42:55 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=40000, worker_address='http://localhost:40000', controller_address='http://localhost:10000', model_path='/root/onethingai-tmp/models/InternVL-Chat-V1-5', model_base=None, model_name=None, device='auto', multi_modal=False, limit_model_concurrency=5, stream_interval=1, no_register=False, load_8bit=False, load_4bit=False)
2024-05-18 23:42:55 | INFO | model_worker | Loading the model InternVL-Chat-V1-5 on worker 8b8837 ...
The repository for /root/onethingai-tmp/models/InternVL-Chat-V1-5 contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//root/onethingai-tmp/models/InternVL-Chat-V1-5.
You can avoid this prompt in future by passing the argument trust_remote_code=True.
Do you wish to run the custom code? [y/N] y
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
You are using a model of type internvl_chat to instantiate a model of type llava_llama. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards: 0%| | 0/11 [00:00<?, ?it/s]
Loading checkpoint shards: 9%|███████▌ | 1/11 [00:00<00:02, 4.16it/s]
Loading checkpoint shards: 27%|██████████████████████▋ | 3/11 [00:00<00:00, 8.34it/s]
Loading checkpoint shards: 45%|█████████████████████████████████████▋ | 5/11 [00:00<00:00, 10.63it/s]
Loading checkpoint shards: 64%|████████████████████████████████████████████████████▊ | 7/11 [00:00<00:00, 11.94it/s]
Loading checkpoint shards: 82%|███████████████████████████████████████████████████████████████████▉ | 9/11 [00:00<00:00, 12.76it/s]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 13.20it/s]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 11.51it/s]
2024-05-18 23:49:32 | ERROR | stderr |
Some weights of LlavaLlamaForCausalLM were not initialized from the model checkpoint at /root/onethingai-tmp/models/InternVL-Chat-V1-5 and are newly initialized: ['embed_tokens.weight', 'layers.0.input_layernorm.weight', 'layers.0.mlp.down_proj.weight', 'layers.0.mlp.gate_proj.weight', 'layers.0.mlp.up_proj.weight', 'layers.0.post_attention_layernorm.weight', 'layers.0.self_attn.k_proj.weight', 'layers.0.self_attn.o_proj.weight', 'layers.0.self_attn.q_proj.weight', 'layers.0.self_attn.v_proj.weight', 'layers.1.input_layernorm.weight', 'layers.1.mlp.down_proj.weight', 'layers.1.mlp.gate_proj.weight', 'layers.1.mlp.up_proj.weight', 'layers.1.post_attention_layernorm.weight', 'layers.1.self_attn.k_proj.weight', 'layers.1.self_attn.o_proj.weight', 'layers.1.self_attn.q_proj.weight', 'layers.1.self_attn.v_proj.weight', 'layers.10.input_layernorm.weight', 'layers.10.mlp.down_proj.weight', 'layers.10.mlp.gate_proj.weight', 'layers.10.mlp.up_proj.weight', 'layers.10.post_attention_layernorm.weight', 'layers.10.self_attn.k_proj.weight', 'layers.10.self_attn.o_proj.weight', 'layers.10.self_attn.q_proj.weight', 'layers.10.self_attn.v_proj.weight', 'layers.11.input_layernorm.weight', 'layers.11.mlp.down_proj.weight', 'layers.11.mlp.gate_proj.weight', 'layers.11.mlp.up_proj.weight', 'layers.11.post_attention_layernorm.weight', 'layers.11.self_attn.k_proj.weight', 'layers.11.self_attn.o_proj.weight', 'layers.11.self_attn.q_proj.weight', 'layers.11.self_attn.v_proj.weight', 'layers.12.input_layernorm.weight', 'layers.12.mlp.down_proj.weight', 'layers.12.mlp.gate_proj.weight', 'layers.12.mlp.up_proj.weight', 'layers.12.post_attention_layernorm.weight', 'layers.12.self_attn.k_proj.weight', 'layers.12.self_attn.o_proj.weight', 'layers.12.self_attn.q_proj.weight', 'layers.12.self_attn.v_proj.weight', 'layers.13.input_layernorm.weight', 'layers.13.mlp.down_proj.weight', 'layers.13.mlp.gate_proj.weight', 'layers.13.mlp.up_proj.weight', 'layers.13.post_attention_layernorm.weight', 
'layers.13.self_attn.k_proj.weight', 'layers.13.self_attn.o_proj.weight', 'layers.13.self_attn.q_proj.weight', 'layers.13.self_attn.v_proj.weight', 'layers.14.input_layernorm.weight', 'layers.14.mlp.down_proj.weight', 'layers.14.mlp.gate_proj.weight', 'layers.14.mlp.up_proj.weight', 'layers.14.post_attention_layernorm.weight', 'layers.14.self_attn.k_proj.weight', 'layers.14.self_attn.o_proj.weight', 'layers.14.self_attn.q_proj.weight', 'layers.14.self_attn.v_proj.weight', 'layers.15.input_layernorm.weight', 'layers.15.mlp.down_proj.weight', 'layers.15.mlp.gate_proj.weight', 'layers.15.mlp.up_proj.weight', 'layers.15.post_attention_layernorm.weight', 'layers.15.self_attn.k_proj.weight', 'layers.15.self_attn.o_proj.weight', 'layers.15.self_attn.q_proj.weight', 'layers.15.self_attn.v_proj.weight', 'layers.16.input_layernorm.weight', 'layers.16.mlp.down_proj.weight', 'layers.16.mlp.gate_proj.weight', 'layers.16.mlp.up_proj.weight', 'layers.16.post_attention_layernorm.weight', 'layers.16.self_attn.k_proj.weight', 'layers.16.self_attn.o_proj.weight', 'layers.16.self_attn.q_proj.weight', 'layers.16.self_attn.v_proj.weight', 'layers.17.input_layernorm.weight', 'layers.17.mlp.down_proj.weight', 'layers.17.mlp.gate_proj.weight', 'layers.17.mlp.up_proj.weight', 'layers.17.post_attention_layernorm.weight', 'layers.17.self_attn.k_proj.weight', 'layers.17.self_attn.o_proj.weight', 'layers.17.self_attn.q_proj.weight', 'layers.17.self_attn.v_proj.weight', 'layers.18.input_layernorm.weight', 'layers.18.mlp.down_proj.weight', 'layers.18.mlp.gate_proj.weight', 'layers.18.mlp.up_proj.weight', 'layers.18.post_attention_layernorm.weight', 'layers.18.self_attn.k_proj.weight', 'layers.18.self_attn.o_proj.weight', 'layers.18.self_attn.q_proj.weight', 'layers.18.self_attn.v_proj.weight', 'layers.19.input_layernorm.weight', 'layers.19.mlp.down_proj.weight', 'layers.19.mlp.gate_proj.weight', 'layers.19.mlp.up_proj.weight', 'layers.19.post_attention_layernorm.weight', 
'layers.19.self_attn.k_proj.weight', 'layers.19.self_attn.o_proj.weight', 'layers.19.self_attn.q_proj.weight', 'layers.19.self_attn.v_proj.weight', 'layers.2.input_layernorm.weight', 'layers.2.mlp.down_proj.weight', 'layers.2.mlp.gate_proj.weight', 'layers.2.mlp.up_proj.weight', 'layers.2.post_attention_layernorm.weight', 'layers.2.self_attn.k_proj.weight', 'layers.2.self_attn.o_proj.weight', 'layers.2.self_attn.q_proj.weight', 'layers.2.self_attn.v_proj.weight', 'layers.20.input_layernorm.weight', 'layers.20.mlp.down_proj.weight', 'layers.20.mlp.gate_proj.weight', 'layers.20.mlp.up_proj.weight', 'layers.20.post_attention_layernorm.weight', 'layers.20.self_attn.k_proj.weight', 'layers.20.self_attn.o_proj.weight', 'layers.20.self_attn.q_proj.weight', 'layers.20.self_attn.v_proj.weight', 'layers.21.input_layernorm.weight', 'layers.21.mlp.down_proj.weight', 'layers.21.mlp.gate_proj.weight', 'layers.21.mlp.up_proj.weight', 'layers.21.post_attention_layernorm.weight', 'layers.21.self_attn.k_proj.weight', 'layers.21.self_attn.o_proj.weight', 'layers.21.self_attn.q_proj.weight', 'layers.21.self_attn.v_proj.weight', 'layers.22.input_layernorm.weight', 'layers.22.mlp.down_proj.weight', 'layers.22.mlp.gate_proj.weight', 'layers.22.mlp.up_proj.weight', 'layers.22.post_attention_layernorm.weight', 'layers.22.self_attn.k_proj.weight', 'layers.22.self_attn.o_proj.weight', 'layers.22.self_attn.q_proj.weight', 'layers.22.self_attn.v_proj.weight', 'layers.23.input_layernorm.weight', 'layers.23.mlp.down_proj.weight', 'layers.23.mlp.gate_proj.weight', 'layers.23.mlp.up_proj.weight', 'layers.23.post_attention_layernorm.weight', 'layers.23.self_attn.k_proj.weight', 'layers.23.self_attn.o_proj.weight', 'layers.23.self_attn.q_proj.weight', 'layers.23.self_attn.v_proj.weight', 'layers.24.input_layernorm.weight', 'layers.24.mlp.down_proj.weight', 'layers.24.mlp.gate_proj.weight', 'layers.24.mlp.up_proj.weight', 'layers.24.post_attention_layernorm.weight', 
'layers.24.self_attn.k_proj.weight', 'layers.24.self_attn.o_proj.weight', 'layers.24.self_attn.q_proj.weight', 'layers.24.self_attn.v_proj.weight', 'layers.25.input_layernorm.weight', 'layers.25.mlp.down_proj.weight', 'layers.25.mlp.gate_proj.weight', 'layers.25.mlp.up_proj.weight', 'layers.25.post_attention_layernorm.weight', 'layers.25.self_attn.k_proj.weight', 'layers.25.self_attn.o_proj.weight', 'layers.25.self_attn.q_proj.weight', 'layers.25.self_attn.v_proj.weight', 'layers.26.input_layernorm.weight', 'layers.26.mlp.down_proj.weight', 'layers.26.mlp.gate_proj.weight', 'layers.26.mlp.up_proj.weight', 'layers.26.post_attention_layernorm.weight', 'layers.26.self_attn.k_proj.weight', 'layers.26.self_attn.o_proj.weight', 'layers.26.self_attn.q_proj.weight', 'layers.26.self_attn.v_proj.weight', 'layers.27.input_layernorm.weight', 'layers.27.mlp.down_proj.weight', 'layers.27.mlp.gate_proj.weight', 'layers.27.mlp.up_proj.weight', 'layers.27.post_attention_layernorm.weight', 'layers.27.self_attn.k_proj.weight', 'layers.27.self_attn.o_proj.weight', 'layers.27.self_attn.q_proj.weight', 'layers.27.self_attn.v_proj.weight', 'layers.28.input_layernorm.weight', 'layers.28.mlp.down_proj.weight', 'layers.28.mlp.gate_proj.weight', 'layers.28.mlp.up_proj.weight', 'layers.28.post_attention_layernorm.weight', 'layers.28.self_attn.k_proj.weight', 'layers.28.self_attn.o_proj.weight', 'layers.28.self_attn.q_proj.weight', 'layers.28.self_attn.v_proj.weight', 'layers.29.input_layernorm.weight', 'layers.29.mlp.down_proj.weight', 'layers.29.mlp.gate_proj.weight', 'layers.29.mlp.up_proj.weight', 'layers.29.post_attention_layernorm.weight', 'layers.29.self_attn.k_proj.weight', 'layers.29.self_attn.o_proj.weight', 'layers.29.self_attn.q_proj.weight', 'layers.29.self_attn.v_proj.weight', 'layers.3.input_layernorm.weight', 'layers.3.mlp.down_proj.weight', 'layers.3.mlp.gate_proj.weight', 'layers.3.mlp.up_proj.weight', 'layers.3.post_attention_layernorm.weight', 
'layers.3.self_attn.k_proj.weight', 'layers.3.self_attn.o_proj.weight', 'layers.3.self_attn.q_proj.weight', 'layers.3.self_attn.v_proj.weight', 'layers.30.input_layernorm.weight', 'layers.30.mlp.down_proj.weight', 'layers.30.mlp.gate_proj.weight', 'layers.30.mlp.up_proj.weight', 'layers.30.post_attention_layernorm.weight', 'layers.30.self_attn.k_proj.weight', 'layers.30.self_attn.o_proj.weight', 'layers.30.self_attn.q_proj.weight', 'layers.30.self_attn.v_proj.weight', 'layers.31.input_layernorm.weight', 'layers.31.mlp.down_proj.weight', 'layers.31.mlp.gate_proj.weight', 'layers.31.mlp.up_proj.weight', 'layers.31.post_attention_layernorm.weight', 'layers.31.self_attn.k_proj.weight', 'layers.31.self_attn.o_proj.weight', 'layers.31.self_attn.q_proj.weight', 'layers.31.self_attn.v_proj.weight', 'layers.4.input_layernorm.weight', 'layers.4.mlp.down_proj.weight', 'layers.4.mlp.gate_proj.weight', 'layers.4.mlp.up_proj.weight', 'layers.4.post_attention_layernorm.weight', 'layers.4.self_attn.k_proj.weight', 'layers.4.self_attn.o_proj.weight', 'layers.4.self_attn.q_proj.weight', 'layers.4.self_attn.v_proj.weight', 'layers.5.input_layernorm.weight', 'layers.5.mlp.down_proj.weight', 'layers.5.mlp.gate_proj.weight', 'layers.5.mlp.up_proj.weight', 'layers.5.post_attention_layernorm.weight', 'layers.5.self_attn.k_proj.weight', 'layers.5.self_attn.o_proj.weight', 'layers.5.self_attn.q_proj.weight', 'layers.5.self_attn.v_proj.weight', 'layers.6.input_layernorm.weight', 'layers.6.mlp.down_proj.weight', 'layers.6.mlp.gate_proj.weight', 'layers.6.mlp.up_proj.weight', 'layers.6.post_attention_layernorm.weight', 'layers.6.self_attn.k_proj.weight', 'layers.6.self_attn.o_proj.weight', 'layers.6.self_attn.q_proj.weight', 'layers.6.self_attn.v_proj.weight', 'layers.7.input_layernorm.weight', 'layers.7.mlp.down_proj.weight', 'layers.7.mlp.gate_proj.weight', 'layers.7.mlp.up_proj.weight', 'layers.7.post_attention_layernorm.weight', 'layers.7.self_attn.k_proj.weight', 
'layers.7.self_attn.o_proj.weight', 'layers.7.self_attn.q_proj.weight', 'layers.7.self_attn.v_proj.weight', 'layers.8.input_layernorm.weight', 'layers.8.mlp.down_proj.weight', 'layers.8.mlp.gate_proj.weight', 'layers.8.mlp.up_proj.weight', 'layers.8.post_attention_layernorm.weight', 'layers.8.self_attn.k_proj.weight', 'layers.8.self_attn.o_proj.weight', 'layers.8.self_attn.q_proj.weight', 'layers.8.self_attn.v_proj.weight', 'layers.9.input_layernorm.weight', 'layers.9.mlp.down_proj.weight', 'layers.9.mlp.gate_proj.weight', 'layers.9.mlp.up_proj.weight', 'layers.9.post_attention_layernorm.weight', 'layers.9.self_attn.k_proj.weight', 'layers.9.self_attn.o_proj.weight', 'layers.9.self_attn.q_proj.weight', 'layers.9.self_attn.v_proj.weight', 'lm_head.weight', 'norm.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2024-05-18 23:49:45 | ERROR | stderr | Traceback (most recent call last):
2024-05-18 23:49:45 | ERROR | stderr | File "/root/onethingai-tmp/anaconda3/envs/internvl/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2024-05-18 23:49:45 | ERROR | stderr | return _run_code(code, main_globals, None,
2024-05-18 23:49:45 | ERROR | stderr | File "/root/onethingai-tmp/anaconda3/envs/internvl/lib/python3.10/runpy.py", line 86, in _run_code
2024-05-18 23:49:45 | ERROR | stderr | exec(code, run_globals)
2024-05-18 23:49:45 | ERROR | stderr | File "/root/onethingai-tmp/workspace/InternVL/internvl_chat_llava/llava/serve/model_worker.py", line 275, in
Could you please help with resolving this issue?
Thanks in advance!
P.S. The output of pip list is as follows (columns: Package, Version):
accelerate 0.30.1 addict 2.4.0 aiofiles 23.2.1 aliyun-python-sdk-core 2.15.1 aliyun-python-sdk-kms 2.16.3 altair 5.3.0 annotated-types 0.6.0 anyio 4.3.0 apex 0.1 attrs 23.2.0 certifi 2024.2.2 cffi 1.16.0 charset-normalizer 3.3.2 click 8.1.7 cmake 3.29.3 colorama 0.4.6 contourpy 1.2.1 crcmod 1.7 cryptography 42.0.7 cycler 0.12.1 deepspeed 0.13.5 dnspython 2.6.1 einops 0.8.0 email_validator 2.1.1 exceptiongroup 1.2.1 fastapi 0.111.0 fastapi-cli 0.0.3 ffmpy 0.3.2 filelock 3.14.0 fire 0.6.0 flash-attn 2.3.6 fonttools 4.51.0 fsspec 2024.5.0 gradio 3.50.2 gradio_client 0.6.1 h11 0.14.0 hjson 3.1.0 httpcore 1.0.5 httptools 0.6.1 httpx 0.27.0 huggingface-hub 0.23.0 idna 3.7 importlib_metadata 7.1.0 importlib_resources 6.4.0 Jinja2 3.1.4 jmespath 0.10.0 jsonschema 4.22.0 jsonschema-specifications 2023.12.1 kiwisolver 1.4.5 lit 18.1.4 Markdown 3.6 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matplotlib 3.9.0 mdurl 0.1.2 mmcv-full 1.6.2 mmengine-lite 0.10.4 model-index 0.1.11 mpmath 1.3.0 networkx 3.3 ninja 1.11.1.1 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.19.3 nvidia-nvjitlink-cu12 12.4.127 nvidia-nvtx-cu12 12.1.105 opencv-python 4.9.0.80 opendatalab 0.0.10 openmim 0.3.9 openxlab 0.0.26 ordered-set 4.1.0 orjson 3.10.3 oss2 2.17.0 packaging 24.0 pandas 2.2.2 peft 0.9.0 pillow 10.3.0 pip 24.0 platformdirs 4.2.2 protobuf 5.26.1 psutil 5.9.8 py-cpuinfo 9.0.0 pycocoevalcap 1.2 pycocotools 2.0.7 pycparser 2.22 pycryptodome 3.20.0 pydantic 2.7.1 pydantic_core 2.18.2 pydub 0.25.1 Pygments 2.18.0 pynvml 11.5.0 pyparsing 3.1.2 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-multipart 0.0.9 pytz 2023.4 PyYAML 6.0.1 referencing 0.35.1 regex 2024.5.15 requests 2.28.2 rich 13.4.2 rpds-py 0.18.1 ruff 0.4.4 safetensors 0.4.3 
scipy 1.13.0 semantic-version 2.10.0 sentencepiece 0.2.0 setuptools 60.2.0 shellingham 1.5.4 shortuuid 1.0.13 six 1.16.0 sniffio 1.3.1 starlette 0.37.2 sympy 1.12 tabulate 0.9.0 termcolor 2.4.0 tiktoken 0.7.0 timm 0.9.12 tokenizers 0.15.2 tomli 2.0.1 tomlkit 0.12.0 toolz 0.12.1 torch 2.0.1+cu118 torchaudio 2.0.2+cu118 torchvision 0.15.2+cu118 tqdm 4.65.2 transformers 4.37.2 triton 2.0.0 typer 0.12.3 typing_extensions 4.11.0 tzdata 2024.1 ujson 5.10.0 urllib3 1.26.18 uvicorn 0.29.0 uvloop 0.19.0 watchfiles 0.21.0 websockets 11.0.3 wheel 0.43.0 yacs 0.1.8 yapf 0.40.2 zipp 3.18.2
Oh, I figured it out. You should go to the internvl_chat folder and execute the following command:
python3.10 -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000/ --port 40000 --device auto --worker http://localhost:40000/ --model-path /root/onethingai-tmp/models/InternVL-Chat-V1-5
You should use internvl.serve.model_worker instead of llava.serve.model_worker.
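For anyone hitting the same mismatch, here is a minimal sketch of a pre-flight check that reads model_type from the checkpoint's config.json and suggests which worker module to launch. The helper name suggest_worker and the mapping table are hypothetical and inferred only from this thread (the log reports model type internvl_chat, and the failing worker lives under internvl_chat_llava); this is not an official InternVL utility.

```python
import json
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical mapping inferred from this thread, not from the InternVL docs:
# model_type -> (folder to run from, worker module to pass to `python -m`).
WORKER_BY_MODEL_TYPE = {
    "internvl_chat": ("internvl_chat", "internvl.serve.model_worker"),
    "llava_llama": ("internvl_chat_llava", "llava.serve.model_worker"),
}


def suggest_worker(model_path: str) -> Optional[Tuple[str, str]]:
    """Return (folder, module) for the checkpoint, or None if unknown."""
    config_file = Path(model_path) / "config.json"
    with config_file.open("r", encoding="utf-8") as f:
        config = json.load(f)
    # A missing or unrecognized model_type yields None instead of a guess.
    return WORKER_BY_MODEL_TYPE.get(config.get("model_type"))


def describe(model_path: str) -> str:
    """Build a human-readable hint for the given checkpoint directory."""
    suggestion = suggest_worker(model_path)
    if suggestion is None:
        return "Unknown model_type; check the deployment document."
    folder, module = suggestion
    return f"cd {folder} && python -m {module} --model-path {model_path}"
```

Calling describe("/root/onethingai-tmp/models/InternVL-Chat-V1-5") on a checkpoint whose config.json reports internvl_chat would point to the internvl_chat folder and internvl.serve.model_worker; the remaining flags (--host, --controller, and so on) still need to be added as in the command above.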
@czczup
Yes, I missed that critical information in the documentation. It worked after I changed the folder path and used "internvl.serve.model_worker".
Thank you very much!
P.S. Is there any plan to support Apple M-series CPUs? They are a very cost-effective choice for LLM application development. Thanks.
Yes, I have considered Mac, but I've been quite busy with work lately and haven't had the time to try it out. On the other hand, my Mac only has 16GB of memory, which might not be sufficient to debug this 26B model.
But we recently released smaller models with 2B and 4B parameters, which I might be able to deploy on my device.
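As a rough back-of-the-envelope check of the memory point above, the sketch below estimates the size of the weights alone. It assumes 16-bit weights (2 bytes per parameter) and counts 1 GB as 10^9 bytes; activations, the KV cache, and operating system overhead are ignored, so real requirements would be higher. The function names are hypothetical.

```python
def weight_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate weight size in GB (10^9 bytes), weights only."""
    # params_billion * 1e9 parameters * bytes_per_param / 1e9 bytes per GB
    return params_billion * bytes_per_param


def weights_fit(params_billion: float, memory_gb: float) -> bool:
    """True if the 16-bit weights alone are smaller than the given memory."""
    return weight_memory_gb(params_billion) < memory_gb


if __name__ == "__main__":
    # Model sizes mentioned in the thread, against a 16 GB machine.
    for size in (26, 4, 2):
        print(size, "B:", weight_memory_gb(size), "GB, fits:", weights_fit(size, 16))
```

Under these assumptions the 26B model needs about 52 GB for weights alone, well beyond 16 GB, while the 2B and 4B models come to roughly 4 GB and 8 GB.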
Great!