Add snip_momentum structured pruning, which supports higher sparsity ratios
This PR contributes the snip_momentum pruning algorithm from Intel Neural Compressor to DeepSpeed compression, as we proposed in the RFC.
The snip_momentum algo implements the algorithm described here.
We tested it on DeepSpeedExamples/compression/bert with a newly added script, bash_script/pruning_sparse_snip_momentum.sh, and got the results below. The changes to the examples are here.
| pattern | sparsity ratio | pruning method | epochs | acc & mm-acc |
|---|---|---|---|---|
| 1x1 | 80% | DeepSpeed L1 | 2 | 0.8113/0.822 |
| 1x1 | 80% | snip_momentum | 2 | 0.8176/0.822 |
| 4x1 | 80% | snip_momentum | 10 | 0.8248/0.8305 |
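
For readers unfamiliar with the `pattern` and `sparsity ratio` columns, here is a minimal NumPy sketch of block-wise pruning driven by a momentum-accumulated SNIP-style score. It is illustrative only and is not the code in this PR: the function names `update_scores` and `block_mask`, the `momentum` value, the `|weight * grad|` score, and the reading of `4x1` as 4 rows by 1 column of the stored weight matrix are all assumptions made for the example.

```python
import numpy as np


def update_scores(scores, weight, grad, momentum=0.9):
    """Accumulate a SNIP-style importance |w * dL/dw| with momentum.

    Hypothetical helper; the momentum value is an assumption.
    """
    return momentum * scores + np.abs(weight * grad)


def block_mask(scores, sparsity, block=(4, 1)):
    """Return a boolean keep-mask that drops the lowest-scoring blocks."""
    bh, bw = block
    rows, cols = scores.shape
    assert rows % bh == 0 and cols % bw == 0, "shape must divide into blocks"
    # Sum the element scores inside each (bh x bw) block.
    block_scores = scores.reshape(rows // bh, bh, cols // bw, bw).sum(axis=(1, 3))
    n_prune = int(round(sparsity * block_scores.size))
    keep = np.ones(block_scores.size, dtype=bool)
    # Mark the n_prune blocks with the smallest scores as pruned.
    order = np.argsort(block_scores, axis=None, kind="stable")
    keep[order[:n_prune]] = False
    keep = keep.reshape(block_scores.shape)
    # Expand the per-block decision back to element granularity.
    return np.repeat(np.repeat(keep, bh, axis=0), bw, axis=1)


rng = np.random.default_rng(0)
weight = rng.normal(size=(8, 10))
scores = np.zeros_like(weight)
for _ in range(3):
    grad = rng.normal(size=weight.shape)  # stand-in for real gradients
    scores = update_scores(scores, weight, grad)

mask = block_mask(scores, sparsity=0.8, block=(4, 1))
pruned_weight = weight * mask
print("achieved sparsity:", 1.0 - mask.mean())
```

With `block=(1, 1)` the same sketch reduces to unstructured pruning, which corresponds to the `1x1` rows in the table.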
cc @hshen14 @wenhuach21
@microsoft-github-policy-service agree company="Intel"
Because different algorithms may not share the same best hyperparameters, we also tried other settings. The main differences are that we use only the second-to-last layer for distillation and change the lr.
| pattern | sparsity ratio | pruning method | epochs | acc & mm-acc |
|---|---|---|---|---|
| 4x1 | 80% | snip_momentum | 2 | 0.8284/0.8388 |
| 4x1 | 80% | snip_momentum | 6 | 0.8339/0.8418 |
Tested the accuracy, and it looks great.
@ftian1, there is a formatting issue on the PR. The pre-commit needs to be run and the file changes committed to the branch. In particular, the following needs to be run on the repo:
pre-commit run --all-files
@xiaoxiawu-microsoft Sorry for the late response (due to the PRC holiday), and thanks for your review.
I have fixed the yapf scan issue. However, on my local machine the destroyed-symlinks check always fails after merging master. I am not sure why, since everything looks fine, so I pushed the code first. I hope it does not waste pre-CI resources.
@xiaoxiawu-microsoft These pre-CI errors are not related to my changes. Could you please take a look?