
When pruning at 20%, one layer becomes entirely zero. Is this a problem?

Open YoungjaeDev opened this issue 3 years ago • 7 comments

(screenshot attached)

With a sparsity rate of 0.00001 the mAP showed a stable trend, but due to time constraints I ran sparse training for only 20 epochs before pruning. Do you think there's a problem?
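For reference, the failure mode in the question is easy to reproduce: if the prune threshold is a global percentile over all BN scale magnitudes (as channel-slimming pruning typically does), a layer whose coefficients all fall below that threshold loses every channel. A minimal numpy sketch; the function name and data are illustrative, not from this repo:

```python
import numpy as np

def layers_fully_pruned(bn_gammas, prune_ratio):
    """Given per-layer arrays of BN scale coefficients, find layers in
    which a single global percentile threshold would remove every channel."""
    # One global threshold over all |gamma| values, as global-percentile
    # channel pruning typically does.
    all_g = np.concatenate([np.abs(g) for g in bn_gammas])
    thresh = np.percentile(all_g, prune_ratio * 100)
    dead = [i for i, g in enumerate(bn_gammas)
            if np.all(np.abs(g) <= thresh)]  # no surviving channel
    return thresh, dead

# Layer 1's coefficients are all tiny, so at 20% global pruning
# it ends up with zero remaining channels.
gammas = [np.array([0.8, 0.5, 0.9]),
          np.array([1e-6, 2e-6]),
          np.array([0.4, 0.7, 0.6, 0.3])]
thresh, dead = layers_fully_pruned(gammas, 0.20)
```

In practice, implementations often guard against this by keeping a minimum number of channels per layer regardless of the global threshold.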

YoungjaeDev avatar Jun 21 '22 14:06 YoungjaeDev

You should conduct enough sparse training that many BN coefficients tend to 0; then this problem will not occur.
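One simple way to quantify "enough" is to track the fraction of BN scale coefficients that have collapsed toward zero: for a prune ratio p to be safe, that fraction should comfortably exceed p. A minimal numpy sketch; the threshold `eps` is an arbitrary illustration, not a value from the repo:

```python
import numpy as np

def near_zero_fraction(bn_gammas, eps=1e-2):
    """Fraction of all BN scale coefficients whose magnitude is below eps,
    pooled across layers."""
    g = np.abs(np.concatenate(bn_gammas))
    return float((g < eps).mean())

# Two layers: two of five coefficients are effectively zero.
frac = near_zero_fraction([np.array([1e-4, 2e-4, 0.6]),
                           np.array([0.5, 0.7])])
```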

uyzhang avatar Jun 21 '22 14:06 uyzhang

@uyzhang

  1. What is the criterion for "sufficient"?
  2. And although the mAP rises steadily, can L1 regularization reverse the current sparsity calculation? Thank you for your reply.

YoungjaeDev avatar Jun 21 '22 15:06 YoungjaeDev

  1. The distribution should look like the following figure. (figure attached)
  2. I don't quite understand what this problem is.

uyzhang avatar Jun 21 '22 16:06 uyzhang

@uyzhang

  1. I'm sorry, but where can I see that figure? Is it saved every epoch?
  2. During sparse training my mAP rose stably from 0.5 to 0.65 over 20 epochs. Even if I increase the epochs it saturates at the current level, so would there be much difference in BN sparsity compared to 20 epochs?

YoungjaeDev avatar Jun 21 '22 16:06 YoungjaeDev

  1. As you can see in your TensorBoard, this picture shows the distribution of the BN coefficients.
  2. Strangely, when I carry out sparse training the mAP drops at first. I guess your sparsity coefficient is too small to achieve the effect of sparse training. If sparse training is successful, you can see that most of the BN coefficients in the distribution plot tend to 0; the longer you train, the more coefficients tend to 0.
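The claim that longer sparse training drives more coefficients to zero can be illustrated with the proximal (soft-threshold) view of an L1 penalty. A toy numpy simulation, not the repo's actual training loop:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal step for an L1 penalty: shrink magnitudes by t, clip at zero."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(0)
gamma = rng.normal(0.0, 0.5, size=1000)  # stand-in for BN scale coefficients

fracs = []
for step in range(1, 51):
    gamma = soft_threshold(gamma, 0.01)  # one "epoch" of sparsity pressure
    if step in (5, 50):
        fracs.append(float((gamma == 0.0).mean()))
```

After 5 steps only the coefficients that started very close to zero have vanished; after 50 steps the majority have, which is the long-tail-at-zero distribution the answer above describes.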

uyzhang avatar Jun 21 '22 16:06 uyzhang

@uyzhang Thank you for your quick answer. I'll give it a try and let you know. In addition, this repo also applies L1 regularization to the BN bias, and then with a 10x factor... Is there a reason the original paper only penalizes the scaling factor?
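For illustration, the penalty described above (L1 on the BN scale gamma, plus L1 on the bias beta with a 10x factor, as stated in this thread rather than in the original slimming paper) amounts to adding sign subgradients to the BN gradients during sparse training. A numpy sketch with hypothetical names; `s` and `bias_mult` are illustrative values:

```python
import numpy as np

def sparsity_subgradients(gamma, beta, s=1e-3, bias_mult=10.0):
    """Subgradient of s*|gamma| + s*bias_mult*|beta|, to be added to the
    BN parameter gradients during sparse training."""
    return s * np.sign(gamma), s * bias_mult * np.sign(beta)

g_pen, b_pen = sparsity_subgradients(np.array([0.5, -0.2]),
                                     np.array([0.1, -0.3]))
```

In a PyTorch training loop this would typically be applied between `loss.backward()` and `optimizer.step()` by adding the sign terms to each BN layer's `weight.grad` and `bias.grad`.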

YoungjaeDev avatar Jun 21 '22 22:06 YoungjaeDev

It's just that it works better.

uyzhang avatar Jun 22 '22 02:06 uyzhang