DeepSpeed
[BUG] run_zero_quant.sh seems not working
Describe the bug
I am trying to run run_zero_quant.sh for the GPT-J model, but the output model is not compressed: the output file size is the same as the original model's, and after loading the model, GPU memory usage is also unchanged from the original.
To Reproduce
Steps to reproduce the behavior:
- Go to DeepSpeedExamples/model_compression/bert
- Run: pip install -r requirements.txt
- Run: bash bash_script/run_zero_quant.sh
If I don't change the script at all, it uses the original GPT-2 model; I also tried the GPT-J model. In neither case is the output model file size compressed.
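As a sanity check, here is a minimal sketch of how I compare on-disk checkpoint sizes before and after quantization (the paths in the comments are hypothetical placeholders, not the script's actual output locations):

```python
import os

def file_size_mib(path: str) -> float:
    """Return the on-disk size of a checkpoint file in MiB."""
    return os.path.getsize(path) / (1024 ** 2)

# Hypothetical paths -- substitute the run's actual output directories:
# original = file_size_mib("output/original/pytorch_model.bin")
# quantized = file_size_mib("output/quantized/pytorch_model.bin")
# print(f"original: {original:.1f} MiB, quantized: {quantized:.1f} MiB")
#
# For INT8 weight quantization of an FP16 model one would expect roughly
# a 2x reduction; identical sizes suggest the weights were not packed.
```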
Is this a bug? Could you please check whether this issue can be reproduced?