DeepSpeedExamples
The model size does not change
When I follow the ZeroQuant tutorial at https://www.deepspeed.ai/tutorials/model-compression/#2-tutorial-for-zeroquant-efficient-and-affordable-post-training-quantization and run zero_quant.sh (or quant_activation.sh and quant_weight.sh), the resulting model size is still 418 MB, the same as bert-base.
Are the clean_model weights still being saved as float32? Can you help me? Thanks.
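
For reference, here is the back-of-the-envelope check behind my float32 guess. It is only a rough sketch: the parameter count of about 110 million for bert-base is an approximation, `checkpoint_size_mib` is a hypothetical helper and not part of DeepSpeed, and file-format overhead is ignored.

```python
def checkpoint_size_mib(num_params: int, bytes_per_param: int) -> float:
    """Approximate checkpoint size in MiB, ignoring file-format overhead."""
    return num_params * bytes_per_param / (1024 ** 2)


# Assumption: bert-base has roughly 110 million parameters.
BERT_BASE_PARAMS = 110_000_000

fp32_mib = checkpoint_size_mib(BERT_BASE_PARAMS, 4)  # 4 bytes per float32 value
int8_mib = checkpoint_size_mib(BERT_BASE_PARAMS, 1)  # 1 byte per int8 value

print(f"float32: {fp32_mib:.0f} MiB, int8: {int8_mib:.0f} MiB")
```

The float32 estimate lands close to the 418 MB I see on disk, while weights stored as int8 would be roughly a quarter of that, which is why I suspect the saved file still holds float32 tensors.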