DeepSpeedExamples XTC in DeepSpeed Compression does not work

Open • Toan-Do opened this issue 3 years ago • 1 comment

Hi all,

Thanks for the great work.

I ran some experiments with DeepSpeed Compression using the configs in model_compression/bert and ran into two issues:

The size of the output model produced with the DeepSpeedExamples/model_compression/bert/bash_script/XTC/quant_1bit.sh config is the same as that of the original BERT model, whereas the tutorial (https://www.deepspeed.ai/tutorials/model-compression/#3-tutorial-for-xtc-simple-yet-effective-compression-pipeline-for-extreme-compression) shows the 1-bit/2-bit model size reduced 32 times compared to the original model. The same happens with the model produced by DeepSpeedExamples/model_compression/bert/bash_script/XTC/layer_reduction_1bit.sh.
Fine-tuning the 1-bit/2-bit models is slower than fine-tuning the original model.

Could you help point out possible reasons for these results? Thanks.

Sep 27 '22 08:09 Toan-Do

I noticed this for XTC and ZeroQuant as well, and there are other open issues reporting the same results. I wonder if there is something that we are missing...

Mar 29 '23 23:03 berserkr
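
On the size question above, one possible explanation, offered as an assumption and not a confirmed diagnosis of the scripts: if the binarized weights are still written to the checkpoint as 32-bit floats, the file size will not change, and a 32x reduction only appears once each weight is actually stored in a single bit. The sketch below uses only the Python standard library and hypothetical helper names (`fp32_bytes`, `pack_signs`) to show where the 32x figure comes from; it does not use or reproduce any DeepSpeed API.

```python
import struct


def fp32_bytes(values):
    """Serialize values as 32-bit floats, as a plain float checkpoint would."""
    return struct.pack(f"<{len(values)}f", *values)


def pack_signs(values):
    """Store only the sign of each value, 8 weights per byte (1 bit each)."""
    packed = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v >= 0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


# Hypothetical weights that are already binarized to +scale / -scale.
# (Assumption for illustration; not taken from the XTC scripts.)
scale = 0.05
weights = [scale if i % 3 else -scale for i in range(1024)]

dense = fp32_bytes(weights)    # every weight still costs 4 bytes
packed = pack_signs(weights)   # every weight costs 1 bit
ratio = len(dense) / len(packed)

print(f"fp32: {len(dense)} bytes, packed 1-bit: {len(packed)} bytes, ratio: {ratio}")
```

Under the same assumption, slower fine-tuning would also be plausible, since simulating quantization adds extra work to each forward pass without using any low-bit kernels. Whether this is what the XTC scripts actually do would need to be confirmed by the maintainers.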
