accelerate
Model not offloading to disk when RAM is full
System Info
accelerate 0.18.0
bitsandbytes 0.38.1
diffusers 0.15.1
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
- [ ] My own task or dataset (give details below)
Reproduction
It looks like Accelerate is not offloading models to disk when RAM is occupied. Am I missing something?
Ran on a machine with 16GB of RAM.
from transformers import AutoModelForCausalLM
import torch

checkpoint = "facebook/opt-6.7b"
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    offload_folder="offload",
    offload_state_dict=True,
    torch_dtype=torch.float16,
)
from transformers import Blip2ForConditionalGeneration
import torch

model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-flan-t5-xl",
    device_map="auto",
    offload_folder="offload_1",
    offload_state_dict=True,
    torch_dtype=torch.float16,
)
Expected behavior
As per the docs, shouldn't it offload the model to disk when the RAM is full?
Yes it does. Since you're not describing the problem you encountered, I'm not sure how we can help. You still need to have enough RAM to load the checkpoint shards (which are 10GB each).
Hmm, makes sense. One more question: I need to load, say, 10 Stable Diffusion models, each ~5GB, on a machine with 16GB of RAM. Is that possible? Does Accelerate take care of moving a model to RAM/GPU when it is needed?
Yes, but it will be very slow unless you have a very fast hard drive. You will also need to limit the RAM used by the first models (since Accelerate takes all that is available by default) with the max_memory argument.
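A minimal sketch of that suggestion, building the keyword arguments for from_pretrained with a max_memory cap. The specific limits ("10GiB" for GPU 0, "4GiB" of CPU RAM) are illustrative assumptions, not recommendations; anything beyond these caps is offloaded to offload_folder:

```python
# Cap what Accelerate may allocate per device so later models still
# have room; excess weights spill to the offload folder on disk.
load_kwargs = dict(
    device_map="auto",
    max_memory={0: "10GiB", "cpu": "4GiB"},  # GPU index -> limit, "cpu" -> RAM limit
    offload_folder="offload",
)

# Then pass these when loading, e.g.:
# model = AutoModelForCausalLM.from_pretrained("facebook/opt-6.7b", **load_kwargs)
print(load_kwargs["max_memory"])
```

Each model you load would get its own max_memory budget so the first one does not claim all 16GB.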
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.