Synaptic-Flow All gradients are zero for synflow

Open jackmvision opened this issue 3 years ago • 2 comments

I am using synflow on a ResNet-like model for a classification problem, and I found that the gradients of all the masked_parameters are zero. Any idea why this happens?

self.scores[id(p)] = torch.clone(p.grad * p).detach().abs_()

p is not zero but p.grad is always zero. Thanks!

Feb 18 '22 00:02 jackmvision

Hi, I have the same problem as you. Do you have any idea why it happens?

May 10 '22 06:05 Nobreakfast

Since there seems to be no reaction from the authors, I thought I might chime in. I did not use the code from this repository, but I had a similar issue with ResNet models: the gradients in all the residual blocks were zero until the pruning ratio reached roughly 0.9 or higher.

What solved the problem for me was switching from the torchvision / timm models to the slightly modified implementations proposed in the paper by Alizadeh et al. (2022), whose Appendix A1 describes the implementation differences.

The issue appears to be related to the smaller input images used when not training on ImageNet, but I have not looked into it in much detail. The reference implementations from the paper can be obtained from this repo.
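For anyone trying to narrow this down, here is a minimal, framework-free sketch of what the scoring line computes (the absolute value of gradient times weight, after taking absolute values of the weights and feeding an all-ones input). It is not the repository's code: the helper names `synflow_scores` and `zero_grad_layers` are made up for illustration, and the model is a plain chain of dense layers with no skip connections, batch norm, or convolutions, so it is not a ResNet. It only shows one way to end up with non-zero weights and exactly zero gradients, which may or may not be what happens in the cases above.

```python
# Sketch only: SynFlow-style scores on a chain of dense layers, in pure Python.
# All names here are hypothetical and not taken from the Synaptic-Flow repository.

def matvec(w, x):
    """Return w @ x for a list-of-rows matrix w and a vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def matvec_t(w, g):
    """Return w.T @ g for a list-of-rows matrix w and a vector g."""
    return [sum(w[i][j] * g[i] for i in range(len(w))) for j in range(len(w[0]))]


def synflow_scores(weights):
    """Return (grads, scores) for R = 1^T |W_L| ... |W_1| 1."""
    # "Linearize": replace every weight by its absolute value.
    abs_w = [[[abs(v) for v in row] for row in w] for w in weights]

    # Forward pass on an all-ones input, recording the input to each layer.
    x = [1.0] * len(abs_w[0][0])
    inputs = []
    for w in abs_w:
        inputs.append(x)
        x = matvec(w, x)

    # Backward pass: R is the sum of the outputs, so dR/d(output) is all ones.
    g = [1.0] * len(x)
    grads = [None] * len(abs_w)
    for k in range(len(abs_w) - 1, -1, -1):
        # dR/dW_k[i][j] = (upstream gradient)_i * (layer input)_j
        grads[k] = [[gi * xj for xj in inputs[k]] for gi in g]
        g = matvec_t(abs_w[k], g)

    # Score = |grad * weight|, the pure-Python analogue of (p.grad * p).abs().
    scores = [[[abs(gv * wv) for gv, wv in zip(grow, wrow)]
               for grow, wrow in zip(gmat, wmat)]
              for gmat, wmat in zip(grads, abs_w)]
    return grads, scores


def zero_grad_layers(grads):
    """Return the indices of layers whose gradient is zero everywhere."""
    return [k for k, gmat in enumerate(grads)
            if all(v == 0.0 for row in gmat for v in row)]


weights = [
    [[0.5, -1.0], [2.0, 0.0]],   # layer 0: 2 -> 2
    [[1.0, 1.0], [-0.5, 2.0]],   # layer 1: 2 -> 2
    [[1.0, -3.0]],               # layer 2: 2 -> 1
]
grads, scores = synflow_scores(weights)
print(zero_grad_layers(grads))        # [] : every layer receives gradient

# Zero out the middle layer completely (as if it were fully masked).
blocked = [weights[0], [[0.0, 0.0], [0.0, 0.0]], weights[2]]
blocked_grads, blocked_scores = synflow_scores(blocked)
print(zero_grad_layers(blocked_grads))  # [0, 2]
```

In the second case layers 0 and 2 still have non-zero weights but receive exactly zero gradient, because nothing flows through the middle layer in either direction. Running the same kind of per-layer check on the real model (for example, listing which parameters have an all-zero `p.grad` after the backward pass) should at least show where the signal is cut off.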

Feb 13 '23 09:02 DavidSchischke
