unsloth
Qwen2-VL-72B 4Bit won't load
Hi!
I am having trouble loading the Qwen2-VL-72B models. Here's a minimal code snippet, which I run on a local machine using the standard conda install from the docs. I have tried with both a 32GB Tesla-V100 and an 80GB A100.
import torch
from trl import SFTTrainer, SFTConfig
from unsloth import FastVisionModel, is_bf16_supported # FastLanguageModel for LLMs
from unsloth.trainer import UnslothVisionDataCollator
model_hf_path = 'unsloth/Qwen2-VL-72B-Instruct-bnb-4bit'
model, tokenizer = FastVisionModel.from_pretrained(
    model_hf_path,
    load_in_4bit = True,
    use_gradient_checkpointing = "unsloth",
)
I get the following error:
Traceback (most recent call last):
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/utils/_http.py", line 406, in hf_raise_for_status
response.raise_for_status()
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models/unsloth/qwen2-vl-72b-instruct-unsloth-bnb-4bit
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py", line 125, in _repo_and_revision_exist
self._api.repo_info(
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 2748, in repo_info
return method(
^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 2533, in model_info
hf_raise_for_status(r)
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/utils/_http.py", line 454, in hf_raise_for_status
raise _format(RepositoryNotFoundError, message, response) from e
huggingface_hub.errors.RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-677bb76a-7e60f5736021adae14bf2f7a;410212d4-27c9-42bb-9c2d-4d27d7962042)
Repository Not Found for url: https://huggingface.co/api/models/unsloth/qwen2-vl-72b-instruct-unsloth-bnb-4bit.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/unsloth/models/loader.py", line 413, in from_pretrained
files = HfFileSystem(token = token).glob(os.path.join(model_name, "*.json"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py", line 520, in glob
path = self.resolve_path(path, revision=kwargs.get("revision")).unresolve()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py", line 216, in resolve_path
_raise_file_not_found(path, err)
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py", line 1136, in _raise_file_not_found
raise FileNotFoundError(msg) from err
FileNotFoundError: unsloth/qwen2-vl-72b-instruct-unsloth-bnb-4bit/*.json (repository not found)
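One detail in the traceback above may be worth flagging: the repository in the 404 URL is not the one passed to from_pretrained. The name appears to have been rewritten somewhere before the Hub lookup (an extra "unsloth-" before "bnb-4bit"). The sketch below uses only the standard library to make the mismatch explicit; repo_id_from_api_url is a hypothetical helper written for this illustration, not part of unsloth or huggingface_hub.

```python
import re

# The id passed to FastVisionModel.from_pretrained in the snippet above.
requested = "unsloth/Qwen2-VL-72B-Instruct-bnb-4bit"

# The URL reported in the 404 error from the traceback.
error_url = (
    "https://huggingface.co/api/models/"
    "unsloth/qwen2-vl-72b-instruct-unsloth-bnb-4bit"
)


def repo_id_from_api_url(url: str) -> str:
    """Extract 'owner/name' from a Hub model API URL (hypothetical helper)."""
    match = re.search(r"/api/models/([^/]+/[^/?#]+)", url)
    if match is None:
        raise ValueError(f"not a Hub model API URL: {url}")
    # Error messages sometimes end the URL with a sentence-final period.
    return match.group(1).rstrip(".")


resolved = repo_id_from_api_url(error_url)

print("requested (lowercased):", requested.lower())
print("looked up on the Hub:  ", resolved)
print("same repository?       ", requested.lower() == resolved)
```

If the two ids differ, as they do here, the 404 is about the rewritten name rather than the repository that was actually requested.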
Everything works perfectly with the 2B and 7B models using these parameters. Furthermore, I do not get this error when I load the 72B model with load_in_4bit = False. The model starts to load, but then I get the following error, which suggests that I am running out of memory (even on the A100), so I would really like to load in 4-bit!
Loading checkpoint shards: 0%| | 0/31 [00:00<?, ?it/s]
Loading checkpoint shards: 3%|▎ | 1/31 [00:04<02:22, 4.75s/it]
Loading checkpoint shards: 6%|▋ | 2/31 [00:06<01:33, 3.22s/it]
Loading checkpoint shards: 10%|▉ | 3/31 [00:09<01:17, 2.77s/it]
Loading checkpoint shards: 13%|█▎ | 4/31 [00:11<01:07, 2.52s/it]
Loading checkpoint shards: 16%|█▌ | 5/31 [00:15<01:23, 3.22s/it]
Loading checkpoint shards: 19%|█▉ | 6/31 [00:20<01:37, 3.92s/it]
Loading checkpoint shards: 23%|██▎ | 7/31 [00:25<01:37, 4.06s/it]
Loading checkpoint shards: 26%|██▌ | 8/31 [00:29<01:35, 4.17s/it]
Loading checkpoint shards: 29%|██▉ | 9/31 [00:34<01:36, 4.40s/it]
Loading checkpoint shards: 32%|███▏ | 10/31 [00:38<01:30, 4.32s/it]
Loading checkpoint shards: 35%|███▌ | 11/31 [00:43<01:29, 4.47s/it]
Loading checkpoint shards: 39%|███▊ | 12/31 [00:49<01:30, 4.79s/it]
Loading checkpoint shards: 42%|████▏ | 13/31 [00:54<01:27, 4.86s/it]
Loading checkpoint shards: 45%|████▌ | 14/31 [00:59<01:25, 5.02s/it]
Loading checkpoint shards: 48%|████▊ | 15/31 [01:05<01:27, 5.45s/it]
Loading checkpoint shards: 52%|█████▏ | 16/31 [01:10<01:19, 5.29s/it]
Loading checkpoint shards: 55%|█████▍ | 17/31 [01:14<01:07, 4.84s/it]
Loading checkpoint shards: 58%|█████▊ | 18/31 [01:14<00:44, 3.43s/it]
Loading checkpoint shards: 61%|██████▏ | 19/31 [01:15<00:29, 2.45s/it]
Loading checkpoint shards: 65%|██████▍ | 20/31 [01:15<00:19, 1.76s/it]
Loading checkpoint shards: 68%|██████▊ | 21/31 [01:15<00:12, 1.28s/it]
Loading checkpoint shards: 71%|███████ | 22/31 [01:15<00:08, 1.06it/s]
Loading checkpoint shards: 74%|███████▍ | 23/31 [01:15<00:05, 1.41it/s]
Loading checkpoint shards: 77%|███████▋ | 24/31 [01:15<00:03, 1.81it/s]
Loading checkpoint shards: 81%|████████ | 25/31 [01:16<00:02, 2.28it/s]
Loading checkpoint shards: 84%|████████▍ | 26/31 [01:16<00:01, 2.78it/s]
Loading checkpoint shards: 87%|████████▋ | 27/31 [01:16<00:01, 3.27it/s]
Loading checkpoint shards: 90%|█████████ | 28/31 [01:16<00:00, 3.75it/s]
Loading checkpoint shards: 94%|█████████▎| 29/31 [01:16<00:00, 4.18it/s]
Loading checkpoint shards: 97%|█████████▋| 30/31 [01:16<00:00, 4.60it/s]
Loading checkpoint shards: 100%|██████████| 31/31 [01:17<00:00, 5.15it/s]
Loading checkpoint shards: 100%|██████████| 31/31 [01:17<00:00, 2.48s/it]
Some parameters are on the meta device because they were offloaded to the cpu.
Traceback (most recent call last):
File "./finetuning/src/finetune.py", line 106, in <module>
finetune(args.llm, args.split, args.data_path, args.res_path)
File "./finetuning/src/finetune.py", line 44, in finetune
trainer = SFTTrainer(
^^^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/unsloth/trainer.py", line 203, in new_init
original_init(self, *args, **kwargs)
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/transformers/utils/deprecation.py", line 165, in wrapped_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/trl/trainer/sft_trainer.py", line 307, in __init__
super().__init__(
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/transformers/utils/deprecation.py", line 165, in wrapped_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/transformers/trainer.py", line 574, in __init__
self._move_model_to_device(model, args.device)
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/transformers/trainer.py", line 846, in _move_model_to_device
model = model.to(device)
^^^^^^^^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1340, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 900, in _apply
module._apply(fn)
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 900, in _apply
module._apply(fn)
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 900, in _apply
module._apply(fn)
[Previous line repeated 5 more times]
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 927, in _apply
param_applied = fn(param)
^^^^^^^^^
File "$HOME/miniconda3/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1333, in convert
raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
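A rough back-of-the-envelope check is consistent with the "offloaded to the cpu" message above. The sketch assumes about 72 billion parameters (taken from the model name, not measured) and counts weights only. It ignores quantization constants, activations, gradients, and optimizer state, so real usage will be higher.

```python
def weight_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 72e9  # assumption: parameter count inferred from "72B" in the name
GPUS_GB = {"Tesla V100": 32, "A100": 80}  # the two cards mentioned above

for label, bits in [("16-bit", 16), ("4-bit", 4)]:
    needed = weight_gb(N_PARAMS, bits)
    for gpu, capacity in GPUS_GB.items():
        verdict = "fits" if needed < capacity else "does not fit"
        print(f"{label}: ~{needed:.0f} GB of weights {verdict} on {gpu} ({capacity} GB)")
```

Under these assumptions the 16-bit weights (about 144 GB) exceed both cards, which would explain why part of the model ends up offloaded and the later model.to(device) call fails on meta tensors. The 4-bit weights (about 36 GB) would still be too large for the 32 GB V100 but should fit on the 80 GB A100.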
Thanks in advance, and let me know if I can provide any further details.