DeepSpeedExamples

The model size does not change

Open Twilighter9527 opened this issue 2 years ago • 3 comments

When I follow this tutorial (https://www.deepspeed.ai/tutorials/model-compression/#2-tutorial-for-zeroquant-efficient-and-affordable-post-training-quantization) and run zero_quant.sh (or quant_activation.sh and quant_weight.sh), the model size is still 418 MB, the same as bert-base. Are the clean_model weights still saved as float32? Can you help me? Thanks.

Twilighter9527, Mar 01 '23 08:03
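For reference, a rough back-of-the-envelope check of the reported size. This is only a sketch: it assumes a parameter count of about 109.5M for the bert-base-uncased encoder plus pooler (a task head would add a little more) and ignores any checkpoint metadata. The helper name `checkpoint_size_mib` is made up for illustration and is not part of DeepSpeed.

```python
# Assumption: approximate parameter count of bert-base-uncased
# (embeddings + 12 encoder layers + pooler, no task-specific head).
BERT_BASE_PARAMS = 109_482_240


def checkpoint_size_mib(num_params: int, bytes_per_param: int) -> float:
    """Estimate the on-disk size in MiB if every parameter uses bytes_per_param bytes."""
    return num_params * bytes_per_param / (1024 ** 2)


# Estimated sizes for three storage widths.
fp32_mib = checkpoint_size_mib(BERT_BASE_PARAMS, 4)  # 32-bit floats
fp16_mib = checkpoint_size_mib(BERT_BASE_PARAMS, 2)  # 16-bit floats
int8_mib = checkpoint_size_mib(BERT_BASE_PARAMS, 1)  # 8-bit integers

print(f"float32: {fp32_mib:.0f} MiB")
print(f"float16: {fp16_mib:.0f} MiB")
print(f"int8:    {int8_mib:.0f} MiB")
```

Under these assumptions the float32 estimate comes out at roughly 418 MiB, which matches the size reported above, while weights stored as 8-bit integers would be about a quarter of that. That is consistent with the saved tensors still being 32-bit, though the estimate alone does not confirm it.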

Not just that: inference time does not improve either, nor does peak memory. I did not follow this tutorial, but I am seeing the same results after applying ZeroQuant to a BERT uncased model trained on MRPC. Is there a tutorial that shows improvements? The same can be said for XTC...

berserkr, Mar 29 '23 22:03

Same for me...

Anastasia0411, Mar 31 '23 14:03

Same for me too.

liubai521, Oct 29 '23 11:10