
[Question] ValueError: weight is on the meta device, we need a `value` to put in on 0.

Open J-spacetime opened this issue 1 year ago • 4 comments

Question

(llava) ~/autodl-tmp/LLaVA# python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path liuhaotian/llava-v1-0719-336px-lora-vicuna-13b-v1.3 --model-base vicuna-13b-v1.3

[2023-10-13 17:34:01,478] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2023-10-13 17:34:01 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=40000, worker_address='http://localhost:40000', controller_address='http://localhost:10000', model_path='liuhaotian/llava-v1-0719-336px-lora-vicuna-13b-v1.3', model_base='vicuna-13b-v1.3', model_name=None, device='cuda', multi_modal=False, limit_model_concurrency=5, stream_interval=1, no_register=False, load_8bit=False, load_4bit=False)
2023-10-13 17:34:01 | INFO | model_worker | Loading the model llava-v1-0719-336px-lora-vicuna-13b-v1.3 on worker 56531b ...
You are using the legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
2023-10-13 17:34:03 | INFO | stdout | Loading LLaVA from base model...
Loading checkpoint shards: 100%|██████████| 3/3 [00:29<00:00, 9.85s/it]
2023-10-13 17:34:34 | ERROR | stderr | Traceback (most recent call last):
  File "/root/miniconda3/envs/llava/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/llava/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/root/autodl-tmp/LLaVA/llava/serve/model_worker.py", line 275, in <module>
    worker = ModelWorker(args.controller_address,
  File "/root/autodl-tmp/LLaVA/llava/serve/model_worker.py", line 65, in __init__
    self.tokenizer, self.model, self.image_processor, self.context_len = load_pretrained_model(
  File "/root/autodl-tmp/LLaVA/llava/model/builder.py", line 50, in load_pretrained_model
    model = LlavaLlamaForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, config=lora_cfg_pretrained, **kwargs)
  File "/root/miniconda3/envs/llava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2959, in from_pretrained
    dispatch_model(model, **kwargs)
  File "/root/miniconda3/envs/llava/lib/python3.10/site-packages/accelerate/big_modeling.py", line 371, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "/root/miniconda3/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 536, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "/root/miniconda3/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 536, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "/root/miniconda3/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 536, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "/root/miniconda3/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 506, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "/root/miniconda3/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 155, in add_hook_to_module
    module = hook.init_hook(module)
  File "/root/miniconda3/envs/llava/lib/python3.10/site-packages/accelerate/hooks.py", line 253, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device)
  File "/root/miniconda3/envs/llava/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 267, in set_module_tensor_to_device
    raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on 0.

My device: GPU 0: NVIDIA GeForce RTX 3090

How can I solve this problem?
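For context on what the error itself means: the exception is raised by accelerate when it tries to dispatch the model to GPU 0 and finds a parameter that is still on PyTorch's "meta" device, i.e. a placeholder tensor that was never filled with real weights. A minimal sketch of that situation (my own illustration, not LLaVA code; it assumes only `torch` is installed):

```python
import torch

# A tensor on the "meta" device carries only shape/dtype metadata, no storage.
# With low_cpu_mem_usage=True, transformers/accelerate build the model skeleton
# this way and fill in real weights afterwards; the ValueError above means one
# module's weight was never filled in before dispatch to GPU 0.
w = torch.empty(2, 3, device="meta")
print(w.is_meta)   # True: metadata only
print(w.shape)     # shape is known...

# ...but there is no data ("value") to copy to a real device:
try:
    w.to("cpu")
except RuntimeError as e:  # NotImplementedError (a RuntimeError subclass) on recent PyTorch
    print(type(e).__name__)
```

So the question is effectively: which weights in the assembled LoRA-plus-base model were left uninitialized before `dispatch_model` ran.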

J-spacetime avatar Oct 13 '23 09:10 J-spacetime

I have the same question. How can this be solved?

xj66212 avatar Jan 06 '24 09:01 xj66212

Same question here. Has anyone solved it?

Yuezeyi avatar Feb 23 '24 10:02 Yuezeyi