
[REQUEST] Add a command line argument in `zero_to_fp32.py` to only merge trainable parameters

Sanster opened this issue 2 years ago · 0 comments

**Is your feature request related to a problem? Please describe.**
I am using LoRA + DeepSpeed ZeRO stage 3 to train the Bloom model. One of the advantages of using LoRA is that the saved weight file is very small, which makes deployment and distribution easier.
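
To make the setup concrete, here is a minimal sketch, assuming the Hugging Face peft library and the bloom-560m checkpoint (the issue specifies neither): only the injected low-rank adapter weights are trainable, which is why the LoRA-only checkpoint is so small.

```python
# Minimal LoRA setup sketch (peft and bloom-560m are assumptions, not
# from the issue). Only the injected adapter weights have
# requires_grad=True; all base-model weights stay frozen.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")
config = LoraConfig(r=8, lora_alpha=16,
                    target_modules=["query_key_value"],  # Bloom attention projection
                    task_type="CAUSAL_LM")
model = get_peft_model(base, config)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} / total: {total:,} "
      f"({100 * trainable / total:.2f}%)")
```

For a Bloom-sized model this typically reports a trainable fraction well under 1%, which is the size advantage the request wants to preserve after merging.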

PR https://github.com/microsoft/DeepSpeed/pull/3205 enabled checkpoint load/save of frozen params in ZeRO stage 3, so zero_to_fp32.py now merges the frozen base-model weights as well. As a result, when training with LoRA, the merged model becomes as large as the full model.
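
Until such a flag exists, one workaround is to post-process the merged file and keep only the adapter tensors. The sketch below assumes peft-style parameter names containing "lora_", and the filenames are examples, not anything from zero_to_fp32.py:

```python
# Hypothetical post-processing workaround: filter the full fp32 state
# dict written by zero_to_fp32.py down to the LoRA tensors only.
# "pytorch_model.bin" / "lora_only.bin" are example filenames; the
# "lora_" substring match assumes peft-style naming.
import torch

full_sd = torch.load("pytorch_model.bin", map_location="cpu")
lora_sd = {k: v for k, v in full_sd.items() if "lora_" in k}
torch.save(lora_sd, "lora_only.bin")
print(f"kept {len(lora_sd)} of {len(full_sd)} tensors")
```

This recovers a small file, but it still requires materializing the full merged model first, which is exactly what the requested flag would avoid.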

**Describe the solution you'd like**
Add a command line argument to zero_to_fp32.py, for example:

```python
parser.add_argument("--only_trainable_params", action='store_true', help="Only merge trainable parameters")
```

If --only_trainable_params is enabled, skip _zero2_merge_frozen_params/_zero3_merge_frozen_params so that the frozen base-model weights are left out of the merged output (see the sketch below).
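
A minimal sketch of how the flag might be wired in, assuming zero_to_fp32.py keeps its current shape of calling per-stage frozen-param merge helpers; the wrapper function and helper signatures here are illustrative placeholders, and only the two _zero*_merge_frozen_params names come from the script itself:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--only_trainable_params",
                    action='store_true',
                    help="Only merge trainable parameters")
args = parser.parse_args()

def build_state_dict(state_dict, zero_stage, zero_model_states):
    # Trainable (optimizer-partitioned) params are merged exactly as today.
    # Frozen params are merged only when the new flag is absent, so a LoRA
    # run with --only_trainable_params writes just the adapter weights.
    if not args.only_trainable_params:
        if zero_stage == 2:
            _zero2_merge_frozen_params(state_dict, zero_model_states)  # placeholder signature
        elif zero_stage == 3:
            _zero3_merge_frozen_params(state_dict, zero_model_states)  # placeholder signature
    return state_dict
```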

Sanster · May 04 '23