
Error loading AutoModelForCausalLM with device_map="auto", load_in_8bit=True and fp16=True / weight is on the meta device

JosephChotard opened this issue 2 years ago

System Info

- `Accelerate` version: 0.19.0
- Platform: Linux-3.10.0-1160.88.1.el7.x86_64-x86_64-with-glibc2.31
- Python version: 3.10.11
- Numpy version: 1.24.3
- PyTorch version (GPU?): 2.0.1 (True)
- System RAM: 61.98 GB
- GPU type: Tesla T4
- `Accelerate` default config:
	Not found

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • [X] My own task or dataset (give details below)

Reproduction

Loading facebook/opt-13b with load_in_8bit=True fails with the following code:

import torch
from transformers import AutoModelForCausalLM

model = (
    AutoModelForCausalLM.from_pretrained(
        model_folder,
        local_files_only=True,
        torch_dtype=torch.float16,
        load_in_8bit=True,
        device_map="auto",
    )
    .to(device)
    .cuda()
)

It fails with the following message:

ValueError: weight is on the meta device, we need a `value` to put in on 0.

This seems related to issue #1335, but I'm running transformers 4.29.2, so it should already be fixed? I'm running this in Docker, but I don't believe that should affect it.

Expected behavior

I would expect the model to load as usual. Am I missing something obvious?

JosephChotard avatar Jun 05 '23 11:06 JosephChotard

Hi @JosephChotard, thanks for reporting. I was not able to reproduce the error in my Colab notebook. Please check that you have the latest version of bitsandbytes; that is probably where the error comes from. Furthermore, for models loaded in 8-bit or 4-bit, .to() is not supported. You can use the model as-is thanks to device_map='auto'; the model will be dispatched correctly to your available resources. Let me know if it works.

SunMarc avatar Jun 05 '23 14:06 SunMarc
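
For reference, a minimal sketch of the loading pattern SunMarc describes: the snippet from the reproduction above with the unsupported .to(device) / .cuda() calls removed. model_folder is assumed to be a local path to the downloaded checkpoint, as in the original report.

import torch
from transformers import AutoModelForCausalLM

# device_map="auto" lets accelerate place the quantized weights across
# the available GPUs (and CPU offload, if needed); no .to() or .cuda()
# call should follow for an 8-bit or 4-bit model.
model = AutoModelForCausalLM.from_pretrained(
    model_folder,  # hypothetical local checkpoint path
    local_files_only=True,
    torch_dtype=torch.float16,
    load_in_8bit=True,
    device_map="auto",
)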

You also cannot call .to(xxx) or .cuda() on a model loaded with device_map='auto'. The model will already be loaded on the GPUs you have available.

sgugger avatar Jun 05 '23 15:06 sgugger
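
As a hedged illustration of sgugger's point, a model dispatched with device_map="auto" exposes an hf_device_map attribute that shows where each submodule was placed, so you can inspect the placement rather than move the model. The tokenizer and model_folder below are assumptions carried over from the sketch above.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_folder)  # same hypothetical path

# Mapping of module names to devices chosen by accelerate,
# e.g. {"model.decoder.embed_tokens": 0, ...}
print(model.hf_device_map)

# Only the inputs need to be moved explicitly; on a single-GPU
# "auto" map the embedding layer sits on device 0.
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(0)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))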

Hello, is there any solution now? @JosephChotard I'm hitting the same issue when loading a BLOOM model. My transformers version is 4.31.0.dev0 and my bitsandbytes version is 0.39; both are the latest versions.

TingchenFu avatar Jun 17 '23 13:06 TingchenFu

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jul 11 '23 15:07 github-actions[bot]