Synaptic-Flow
All gradients are zero for synflow
I am using SynFlow on a ResNet-like model for a classification problem, and I found that all gradients of the masked_parameters are zero. Any idea why this happens?
self.scores[id(p)] = torch.clone(p.grad * p).detach().abs_()
p is not zero, but p.grad is always zero. Thanks!
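To narrow down where the zeros appear, a per-layer check can help. The following is a minimal sketch and not the repository's code: `synflow_grad_report` and the toy model are hypothetical names made up for illustration, and it assumes the usual SynFlow recipe of taking absolute values of the parameters, feeding an all-ones input, and backpropagating the summed output.

```python
import torch
import torch.nn as nn


def synflow_grad_report(model, input_shape):
    """Hypothetical helper: fraction of exactly-zero gradient entries per parameter."""
    model.eval()  # assumption: eval mode, so BatchNorm/Dropout behave deterministically
    signs = {}
    with torch.no_grad():
        # Linearize: replace every parameter by its absolute value, remembering the signs.
        for name, p in model.named_parameters():
            signs[name] = torch.sign(p)
            p.abs_()
    model.zero_grad()
    x = torch.ones((1, *input_shape))  # all-ones input, batch size 1
    model(x).sum().backward()
    report = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            report[name] = None  # parameter never took part in the forward pass
        else:
            report[name] = (p.grad == 0).float().mean().item()
    with torch.no_grad():
        # Restore the original signs.
        for name, p in model.named_parameters():
            p.mul_(signs[name])
    return report


torch.manual_seed(0)
# Toy stand-in for the real network; replace with the model under test.
toy_model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 8 * 8, 10),
)
report = synflow_grad_report(toy_model, (3, 8, 8))
for name, zero_fraction in report.items():
    print(name, zero_fraction)
```

A value of 1.0 for a layer means every gradient entry there is exactly zero, which shows whether the problem affects the whole network or only some blocks.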
Hi, I have the same problem as you. Do you have any idea why it happens?
Since there seems to be no reaction from the authors, I thought I might chime in. Although I did not use the code from this repository, I had a similar issue with ResNet models: the gradients in all residual blocks were zero until the pruning ratio became high (roughly 0.9 or above).
What solved the problem for me was switching from the torchvision / timm models to the slightly modified implementations proposed in this paper by Alizadeh et al. (2022); the implementation differences are described in Appendix A1.
It appears to be related to the smaller input image size when not using ImageNet, but I have not looked into it in detail. The reference implementations from the paper can be obtained from this repo.
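For a rough sense of how much the input size matters, the spatial resolution through a torchvision-style ResNet can be computed by hand. This is only a sketch: the helper names are made up, it assumes the standard torchvision layout (7x7 stride-2 stem convolution, 3x3 stride-2 max pooling, then stride-2 downsampling in layer2 to layer4), and it does not establish that this is the cause of the zero gradients.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a convolution or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet_spatial_sizes(input_size):
    """Hypothetical helper: feature-map side length after each stage of a
    torchvision-style ResNet (assumed layout, see the note above)."""
    sizes = {}
    s = conv_out(input_size, kernel=7, stride=2, padding=3)  # stem convolution
    sizes["conv1"] = s
    s = conv_out(s, kernel=3, stride=2, padding=1)  # max pooling
    sizes["maxpool"] = s
    sizes["layer1"] = s  # layer1 keeps the resolution
    for name in ("layer2", "layer3", "layer4"):
        s = conv_out(s, kernel=3, stride=2, padding=1)  # stride-2 3x3 convolution
        sizes[name] = s
    return sizes


print(resnet_spatial_sizes(224))  # ImageNet-sized input
print(resnet_spatial_sizes(32))   # CIFAR-sized input
```

With a 224x224 input the last stage still works on 7x7 feature maps, whereas a 32x32 input is already down to 8x8 after the stem and reaches 1x1 in layer4.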