
dont_change_device for parameters in initialization

lelandfy opened this issue 2 years ago · 3 comments

When I was running model training with ZeRO offload, to save GPU memory I initialized the model weights in CPU memory as well by wrapping model construction in deepspeed.zero.Init(remote_device="cpu", dtype=torch.half, enabled=False) (a minimal sketch of this setup follows the questions below). Although the model weights really are initialized in CPU memory, after deepspeed.initialize() the model is still moved to GPU memory. So I am wondering:

  1. In ZeRO offload (stage 3, offload to CPU/NVMe), is it possible for the model weights to stay mainly in CPU memory/NVMe and only be loaded layer by layer into GPU memory?
  2. I found that engine.py (which is actually called by deepspeed.initialize()) has an argument dont_change_device ([link](https://github.com/microsoft/DeepSpeed/blob/4ae3a3da0dfd19d7ab7a76e7c742ac12f44fc1c0/deepspeed/runtime/engine.py#L1138-L1139)) which controls whether or not the model weights are moved to GPU memory. But I also found no place that sets dont_change_device. So my question is: how is dont_change_device used, and is it meant to keep the model weights in CPU memory?
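A minimal sketch of the setup described above (the model class and config path are hypothetical placeholders):

```python
import torch
import torch.nn as nn
import deepspeed

class MyModel(nn.Module):  # hypothetical stand-in for the real model
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(1024, 1024)

# Construct the weights on CPU in half precision, as described above.
# Note: enabled=False actually disables the zero.Init context entirely
# (the reply below points this out); reproduced here as posted.
with deepspeed.zero.Init(remote_device="cpu", dtype=torch.half, enabled=False):
    model = MyModel()

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",  # hypothetical config path
)
```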

lelandfy avatar Mar 06 '23 08:03 lelandfy

@larry-fuy, to enable ZeRO stage 3 offloading to CPU/NVMe, enabled must be True in deepspeed.zero.Init(). Please see this tutorial for using this feature (a.k.a. ZeRO-Infinity). Here are answers to your specific questions:

  1. Streaming layer weights into the GPU from CPU/NVMe on demand, as you have described, is one of the features of ZeRO-Infinity. You can configure "offload_param" in the ds_config to control this behavior (a config sketch follows this list).
  2. You should not need to manipulate dont_change_device. We can revisit this if the above suggestions don't work for you.
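A minimal sketch of such a configuration, assuming a recent DeepSpeed release (the model class and hyperparameters are placeholders; swap "cpu" for "nvme", plus an nvme_path, for NVMe offload):

```python
import torch
import deepspeed

# ZeRO stage 3 with parameter and optimizer offload to CPU.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-5}},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu"},
    },
}

# enabled=True so parameters are partitioned (and offloadable) at construction.
with deepspeed.zero.Init(remote_device="cpu", config_dict_or_path=ds_config,
                         dtype=torch.half, enabled=True):
    model = MyModel()  # hypothetical nn.Module, as in the sketch above

engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
```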

Thanks!

tjruwase avatar Mar 06 '23 11:03 tjruwase

I'd like to set this "dont_change_device" option because DeepSpeed calls module.to(), which raises an error in transformers with my 4-bit model:

ValueError: `.to` is not supported for `4-bit` or `8-bit` models. Please use the
model as it is, since the model has already been set to the correct devices and 
casted to the correct `dtype`.
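For context, a minimal sketch of the kind of setup that hits this guard (the checkpoint id and config path are hypothetical; transformers rejects .to() on any bitsandbytes-quantized model):

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load a 4-bit quantized model; bitsandbytes handles device placement itself.
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-causal-lm",  # hypothetical checkpoint
    quantization_config=bnb_config,
)

# deepspeed.initialize() calls module.to(device) internally, and
# transformers raises the ValueError quoted above for 4-bit/8-bit models.
engine, *_ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",  # hypothetical config path
)
```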

cerisara avatar Jun 19 '23 16:06 cerisara

> I'd like to set this "dont_change_device" option because DeepSpeed calls module.to(), which raises an error in transformers with my 4-bit model:
>
>     ValueError: `.to` is not supported for `4-bit` or `8-bit` models. Please use the
>     model as it is, since the model has already been set to the correct devices and
>     casted to the correct `dtype`.

Same issue with a BitsAndBytes 4-bit Llama 3.1 70B Instruct model.

tripathiarpan20 avatar Sep 02 '24 08:09 tripathiarpan20