Error loading AutoModelForCausalLM with device_map="auto", load_in_8bit=True and fp16 / weight is on the meta device
System Info
- `Accelerate` version: 0.19.0
- Platform: Linux-3.10.0-1160.88.1.el7.x86_64-x86_64-with-glibc2.31
- Python version: 3.10.11
- Numpy version: 1.24.3
- PyTorch version (GPU?): 2.0.1 (True)
- System RAM: 61.98 GB
- GPU type: Tesla T4
- `Accelerate` default config:
Not found
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)
- [X] My own task or dataset (give details below)
Reproduction
Using facebook/opt-13b with load_in_8bit=True, loading the model fails with the following code:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    model_folder,
    local_files_only=True,
    torch_dtype=torch.float16,
    load_in_8bit=True,
    device_map="auto",
).to(device)  # also tried .cuda()
```
It fails with the following message:

```
ValueError: weight is on the meta device, we need a `value` to put in on 0.
```
This seems related to issue #1335, but I'm running transformers 4.29.2, where that should already be fixed. I am running this in Docker, but I don't believe that should have an impact.
Expected behavior
I would expect the model to load as usual. Am I missing something dumb?
Hi @JosephChotard, thanks for reporting. I was not able to reproduce the error in my Colab notebook. Please check that you have the latest version of bitsandbytes; that is probably where the error comes from. Furthermore, for models loaded in 8-bit or 4-bit, `.to()` is not supported. You can use the model as-is thanks to `device_map="auto"`: the model will be dispatched correctly across your available resources. Let me know if it works.
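For reference, a quick way to confirm which versions are actually installed (run this in the same environment as the failing script, e.g. inside the Docker container):

```python
# Minimal version check; all three libraries expose __version__.
import accelerate
import bitsandbytes
import transformers

print("transformers:", transformers.__version__)
print("accelerate:  ", accelerate.__version__)
print("bitsandbytes:", bitsandbytes.__version__)
```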
You also cannot call `.to(xxx)` or `.cuda()` on a model loaded with `device_map="auto"`; the model will already be placed on the GPUs you have available.
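For example, a minimal sketch of the intended usage, without any `.to()` or `.cuda()` call afterwards (assuming the same facebook/opt-13b checkpoint and a working bitsandbytes install; the prompt and `max_new_tokens` are just illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" dispatches the quantized model across available GPUs;
# do NOT call .to(device) or .cuda() on the returned model afterwards.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-13b",
    torch_dtype=torch.float16,
    load_in_8bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-13b")

# Only the *inputs* are moved to the model's first device.
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```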
Hello, is there any solution now? @JosephChotard I'm hitting the same issue when loading a BLOOM model. My transformers version is 4.31.0.dev0 and my bitsandbytes version is 0.39; both are the latest versions.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.