
Some confusion about the high-level pruners?

Open aidevmin opened this issue 1 year ago • 3 comments

Thanks @VainF for the amazing repo.

I read your paper and looked at the pruner APIs, but I am confused about a few things.

  1. In this link, you said that BNScalePruner and GroupNormPruner support sparse training. Does that mean we need to train the pretrained model again for at least 1 epoch, which changes the pretrained model's parameters? Is that right?

  2. In the benchmark table https://github.com/VainF/Torch-Pruning/tree/master/benchmarks, I saw several methods implemented by you, such as Group-L1, Group-BN, Group-GReg, Ours w/o SL, and Ours. As I understand it, all of these methods estimate the importance of parameters:

  • All of the above methods are group-level and differ from each other only in their importance criteria. Is that right?

  3. Are all pruners in your repo group-level? I am confused because, reading the code, group-level L1 uses tp.pruner.MagnitudePruner and group-level BN uses tp.pruner.BNScalePruner, and these two API names do not contain "Group", while the group-level Group pruner uses tp.pruner.GroupNormPruner, which does. Please correct me.

  4. Your contributions are DepGraph and the new pruning method GroupPruner with sparse learning (based on the L2 norm)? Is that right? If so, is GroupPruner without sparse learning the same as tp.pruner.MagnitudePruner with L2 importance?

  5. As I understand it, tp.pruner.MagnitudePruner is group-level for Conv layers, tp.pruner.BNScalePruner is group-level for BN layers, and tp.pruner.GroupNormPruner is for Conv, BN, and Linear layers. Is that right?

Sorry for my poor English.

aidevmin avatar Oct 08 '23 08:10 aidevmin

  1. In this link, you said that BNScalePruner and GroupNormPruner support sparse training. Does that mean we need to train the pretrained model again for at least 1 epoch, which changes the pretrained model's parameters? Is that right?

Yes, it forces some unimportant parameters to be 0.
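A minimal sketch of what that sparse-training step can look like with Torch-Pruning, assuming the 2023-era API names used in this thread (`GroupNormPruner`, `ch_sparsity`) which may differ slightly in newer releases; `train_loader` is a placeholder for your own DataLoader.

```python
# Sketch: one epoch of sparse training with GroupNormPruner before pruning.
import torch
import torch_pruning as tp
from torchvision.models import resnet18

model = resnet18(pretrained=True)
example_inputs = torch.randn(1, 3, 224, 224)

pruner = tp.pruner.GroupNormPruner(
    model,
    example_inputs,
    importance=tp.importance.MagnitudeImportance(p=2),  # group-level L2 importance
    ch_sparsity=0.5,           # target channel sparsity (newer versions: pruning_ratio)
    ignored_layers=[model.fc], # keep the classifier head intact
)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for epoch in range(1):                      # at least one epoch of sparse training
    for x, y in train_loader:               # train_loader: placeholder DataLoader
        loss = torch.nn.functional.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        pruner.regularize(model)            # push unimportant groups toward zero
        optimizer.step()

pruner.step()  # remove the (now near-zero) unimportant groups
```

The key point is that `pruner.regularize(model)` is called between `loss.backward()` and `optimizer.step()`, so the sparsity penalty does modify the pretrained weights during that extra training phase.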

  2. In the benchmark table https://github.com/VainF/Torch-Pruning/tree/master/benchmarks, I saw several methods implemented by you, such as Group-L1, Group-BN, Group-GReg, Ours w/o SL, and Ours. As I understand it, all of these methods estimate the importance of parameters, they are all group-level, and they differ only in their importance criteria. Is that right?

Yes.

  3. Are all pruners in your repo group-level? I am confused because, reading the code, group-level L1 uses tp.pruner.MagnitudePruner and group-level BN uses tp.pruner.BNScalePruner, and these two API names do not contain "Group", while the group-level Group pruner uses tp.pruner.GroupNormPruner, which does. Please correct me.

Yes, all pruners estimate group-level importance and remove grouped parameters by default.
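For reference, a small sketch of the grouping that every pruner relies on, using the low-level DependencyGraph API from the repo's README (layer names refer to torchvision's resnet18):

```python
# Sketch: build DepGraph and inspect the coupled group for one Conv layer.
import torch
import torch_pruning as tp
from torchvision.models import resnet18

model = resnet18(pretrained=True)
example_inputs = torch.randn(1, 3, 224, 224)

# Trace the model once to build the dependency graph (DepGraph).
DG = tp.DependencyGraph().build_dependency(model, example_inputs=example_inputs)

# Ask for the group coupled with removing output channels 2, 6, 9 of conv1.
group = DG.get_pruning_group(model.conv1, tp.prune_conv_out_channels, idxs=[2, 6, 9])

# The group contains conv1 plus every dependent layer (the following BN,
# the convs consuming its output, ...), so pruning it keeps the model valid.
if DG.check_pruning_group(group):  # avoid removing all channels of a layer
    group.prune()

print(group)  # lists each coupled (layer, pruning_fn, idxs) in the group
```

The high-level pruners (MagnitudePruner, BNScalePruner, GroupNormPruner) all operate on such groups internally; they differ in how the importance score of a group is computed.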

  4. Your contributions are DepGraph and the new pruning method GroupPruner with sparse learning (based on the L2 norm)? Is that right? If so, is GroupPruner without sparse learning the same as tp.pruner.MagnitudePruner with L2 importance?

Right. Both GroupNormPruner and MagnitudePruner inherit from tp.pruner.MetaPruner. The only difference is that GroupNormPruner has an interface for sparse training.
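A small sketch of that equivalence, assuming the 2023-era argument names (`ch_sparsity` instead of the later `pruning_ratio`): if `regularize()` is never called, the two pruners should make the same pruning decisions when given the same L2 importance.

```python
# Sketch: MagnitudePruner with L2 importance vs. GroupNormPruner without sparse training.
import torch
import torch_pruning as tp
from torchvision.models import resnet18

example_inputs = torch.randn(1, 3, 224, 224)
imp = tp.importance.MagnitudeImportance(p=2)  # group-level L2 importance

# Variant A: MagnitudePruner (no sparse-training interface).
model_a = resnet18(pretrained=True)
pruner_a = tp.pruner.MagnitudePruner(
    model_a, example_inputs, importance=imp,
    ch_sparsity=0.5, ignored_layers=[model_a.fc],
)
pruner_a.step()

# Variant B: GroupNormPruner used without ever calling .regularize().
model_b = resnet18(pretrained=True)
pruner_b = tp.pruner.GroupNormPruner(
    model_b, example_inputs, importance=imp,
    ch_sparsity=0.5, ignored_layers=[model_b.fc],
)
pruner_b.step()  # same pruning behavior as variant A; only .regularize() differs
```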

  5. As I understand it, tp.pruner.MagnitudePruner is group-level for Conv layers, tp.pruner.BNScalePruner is group-level for BN layers, and tp.pruner.GroupNormPruner is for Conv, BN, and Linear layers. Is that right?

Yes.

VainF avatar Oct 09 '23 02:10 VainF

@VainF Thank you so much for the quick response. I got it.

aidevmin avatar Oct 09 '23 02:10 aidevmin

@VainF Do you have any recommendation for the number of sparse-training epochs? If it is large, the normal training + sparse training before pruning takes a lot of time.

aidevmin avatar Oct 18 '23 04:10 aidevmin