
[Usage] Unable to load LLaVA v1.6 models

Open levi opened this issue 1 year ago • 14 comments

Describe the issue

Issue:

When trying to load liuhaotian/llava-v1.6-mistral-7b or liuhaotian/llava-v1.6-34b into my container:

MODEL_PATH = "liuhaotian/llava-v1.6-mistral-7b"
USE_8BIT = False
USE_4BIT = False
DEVICE = "cuda"

def download_llava_model():
    from llava.model.builder import load_pretrained_model
    from llava.mm_utils import get_model_name_from_path

    model_name = get_model_name_from_path(MODEL_PATH)
    load_pretrained_model(
        MODEL_PATH, None, model_name, USE_8BIT, USE_4BIT, device=DEVICE
    )

Seeing this error:

  File "/scripts/llava.py", line 23, in download_llava_model
    load_pretrained_model(
  File "/root/llava/llava/model/builder.py", line 151, in load_pretrained_model
    vision_tower.to(device=device, dtype=torch.float16)
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 4 more times]
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 820, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Cannot copy out of meta tensor; no data!

levi · Jan 31 '24

Same error. I think they have not updated the code for v1.6.

Gutianpei · Jan 31 '24

For a brief moment I got it to build, but same error after reproing.

levi · Feb 01 '24

For a brief moment I got it to build, but same error after reproing.

I got it working by updating pytorch/transformers to the latest versions and following this issue: #1036

Gutianpei · Feb 01 '24

Tried various pytorch versions, no luck. I'm installing from an empty container image and pip installing the repo as described in the README.

levi · Feb 01 '24

I have the same problem :(

ninatu · Feb 01 '24

Bumping VRAM to 80GB resolved the issue for me. Possibly an OOM error in disguise?
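
If you want to check whether it's memory-related before resorting to a bigger GPU, a minimal sketch (assuming a CUDA build of PyTorch) is to print the free VRAM just before calling load_pretrained_model:

# Sketch: report free/total VRAM right before loading, so a silent
# offload-to-meta caused by insufficient memory is easier to spot.
import torch

free_bytes, total_bytes = torch.cuda.mem_get_info()
print(f"GPU memory: {free_bytes / 1e9:.1f} GB free of {total_bytes / 1e9:.1f} GB total")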

levi · Feb 01 '24

@levi thanks! That helped!

ninatu · Feb 02 '24

  File "A:\Utilities\DescriptingLLaVa\LLaVA\BatchCaptionFolder.py", line 35, in <module>
    tokenizer, model, image_processor, context_len = load_pretrained_model(
  File "A:\Utilities\DescriptingLLaVa\LLaVA\llava\model\builder.py", line 108, in load_pretrained_model
    model = LlavaMistralForCausalLM.from_pretrained(
NameError: name 'LlavaMistralForCausalLM' is not defined

I have this error too, also launching the 1.6 Mistral-7B, and have had no luck making it work. It happens during model load and initialization here:

from llava.model.builder import load_pretrained_model
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path, model_base, model_name, load_8bit=False, load_4bit=False,device_map='cuda:0', device='cuda:0')

Pirog17000 · Feb 02 '24

Noticed a typo? (screenshot)

Pirog17000 · Feb 03 '24

If I comment out the broken references in builder.py and replace them with direct imports as follows:

#from llava.model import *
from llava.model.language_model.llava_llama import LlavaLlamaForCausalLM
from llava.model.language_model.llava_mpt import LlavaMptForCausalLM as LlavaMPTForCausalLM
from llava.model.language_model.llava_mistral import LlavaMistralForCausalLM

then I got another issue:


  File "A:\Utilities\DescriptingLLaVa\LLaVA\llava\model\builder.py", line 23, in <module>
    from llava.model.language_model.llava_llama import LlavaLlamaForCausalLM
  File "A:\Utilities\DescriptingLLaVa\LLaVA\llava\model\language_model\llava_llama.py", line 21, in <module>
    from transformers import AutoConfig, AutoModelForCausalLM, \
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "C:\Users\Mike\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\import_utils.py", line 1373, in __getattr__
    value = getattr(module, name)
  File "C:\Users\Mike\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\import_utils.py", line 1372, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "C:\Users\Mike\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\import_utils.py", line 1384, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
Failed to import transformers.integrations.peft because of the following error (look up to see its traceback):
DLL load failed while importing libtriton: The specified module could not be found.

Pirog17000 · Feb 03 '24

Bumping VRAM to 80GB resolved the issue for me. Possibly an OOM error in disguise?

How does one "bump VRAM to 80GB"?

rossgreer · Feb 03 '24

You can run inference with 4-bit quantization, which fits the largest 34B variant on a 24GB GPU.
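
Roughly, that looks like this (a sketch only, using the same builder call as above, just with load_4bit=True):

# Sketch: load llava-v1.6-34b in 4-bit so it fits in ~24GB of VRAM
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path

model_path = "liuhaotian/llava-v1.6-34b"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path, None, get_model_name_from_path(model_path),
    load_8bit=False, load_4bit=True, device="cuda",
)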

haotian-liu · Feb 03 '24

@haotian-liu I'm trying to run the 4-bit 34B on a 24GB GPU, but I'm pretty sure it offloads some of the weights to CPU because of low_cpu_mem_usage=True, which results in the above error: NotImplementedError: Cannot copy out of meta tensor; no data!
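
One thing that might be worth trying (just a guess on my side, and only viable if the 4-bit weights really fit in 24GB) is pinning the whole model to a single GPU so nothing is left on the meta device:

# Sketch: force every module onto GPU 0 instead of letting device_map="auto"
# offload layers to CPU/disk. Only works if the quantized weights actually fit.
tokenizer, model, image_processor, context_len = load_pretrained_model(
    "liuhaotian/llava-v1.6-34b", None, "llava-v1.6-34b",
    load_8bit=False, load_4bit=True, device_map={"": 0}, device="cuda",
)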

matankley · Feb 06 '24

@matankley

This is a demo loaded with 4-bit quantization on an A10G (24G). Please check out the latest codebase and retry; if it does not work, please kindly share the commands you're using. Thank you.

haotian-liu · Feb 06 '24

For me, vision_tower.is_loaded() wasn't functioning as anticipated. Manually executing vision_tower.load_model() resolved the issue.
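
For anyone else hitting this, a rough sketch of that workaround (assuming the model came from the builder and exposes get_vision_tower()):

# Sketch: make sure the CLIP vision tower weights are actually materialized
# before anything moves them to the GPU in float16.
import torch

vision_tower = model.get_vision_tower()
if not vision_tower.is_loaded:
    vision_tower.load_model()
vision_tower.to(device="cuda", dtype=torch.float16)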

tonywang10101 · Mar 13 '24

@haotian-liu Thanks for your great research! May I ask: the download is too slow for me here, so I want to save the model (15 .safetensors files) to Google Drive (screenshot).

However, in Google Drive I only end up with 8 shards (20GB), which is about half of the loaded model. Is there another save method for your LlavaLlamaForCausalLM?
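
For context, the save path I would expect to work is the standard Hugging Face one (just a sketch; the Drive mount path and shard size below are placeholders, not from the repo):

# Sketch: save the loaded model/tokenizer/image processor with the standard
# Hugging Face save_pretrained, which LlavaLlamaForCausalLM inherits.
save_dir = "/content/drive/MyDrive/llava-v1.6-34b"  # placeholder Drive path
model.save_pretrained(save_dir, safe_serialization=True, max_shard_size="5GB")
tokenizer.save_pretrained(save_dir)
image_processor.save_pretrained(save_dir)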

hkfisherman · Jun 28 '24