XNOR-Net-PyTorch
About your gradient
First of all, thank you very much for open-sourcing your XNOR-Net PyTorch code. I noticed that when updating the full-precision weights, you multiply the weight gradients by several coefficients:
self.target_modules[index].grad.data = m.add(m_add).mul(1.0-1.0/s[1]).mul(n)
self.target_modules[index].grad.data = self.target_modules[index].grad.data.mul(1e+9)
I could not find any description of these coefficients in the original paper, so I would like to ask why you transform the gradients in this way.
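For reference, here is a minimal standalone sketch of the weight-gradient formula as I read it in the XNOR-Net paper (the function name and shape handling below are my own assumptions, not the repo's code). The two quoted lines appear to correspond to these two terms (`m` and `m_add`), with the extra `(1.0-1.0/s[1])`, `n`, and `1e+9` factors applied on top:

```python
import torch

def xnor_weight_grad(weight, grad_wrt_binarized):
    """Sketch of the XNOR-Net gradient w.r.t. the real-valued weights W.

    Binarization: Wb = alpha * sign(W), with alpha = ||W||_1 / n per output filter.
    With the straight-through estimator (clipped where |W| > 1):
        dC/dW_i = dC/dWb_i * alpha * 1[|W_i| <= 1]
                  + sign(W_i) / n * sum_j(dC/dWb_j * sign(W_j))
    """
    s = weight.size()
    n = weight[0].nelement()                  # elements per output filter
    g = grad_wrt_binarized
    flat_shape = (s[0],) + (1,) * (len(s) - 1)

    # alpha: per-filter mean absolute value, broadcast back to the weight shape
    alpha = weight.abs().view(s[0], -1).mean(dim=1).view(flat_shape).expand(s)

    # term through sign(W): alpha * g, zeroed where |W| > 1 (STE clipping)
    term_sign = alpha * g * (weight.abs() <= 1.0).float()

    # term through alpha: sign(W) * mean_over_filter(sign(W) * g)
    per_filter = (weight.sign() * g).view(s[0], -1).mean(dim=1).view(flat_shape).expand(s)
    term_alpha = weight.sign() * per_filter

    return term_sign + term_alpha
```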
I would also like to know. If the original poster has figured it out, could you please explain? Thanks.
Hi @BobxmuMa @zhaoxiangshun, the parameter 1e+9 appears in the paper author's original repo, so I kept it as well. Its main effect is to increase the range of the weights and reduce the effect of weight decay. I suppose using a much smaller weight decay value would have the same effect. I also tested the accuracy with and without this parameter; in my tests, accuracy was higher when using it.
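To make the weight-decay point concrete, here is a small sketch (toy numbers of my own, not the actual training hyperparameters): with plain SGD and no momentum, scaling the gradient by a constant k gives exactly the same update as multiplying the learning rate by k and dividing the weight decay by k, so the decay term's relative influence shrinks.

```python
import torch

# Toy demonstration: scaling the gradient by k before SGD-with-weight-decay
# is the same update as using lr*k together with weight_decay/k.
torch.manual_seed(0)
k, lr, wd = 1e9, 1e-10, 1e-4   # illustrative values only

w0 = torch.randn(5)
g = torch.randn(5)

# variant A: gradient scaled by k, original lr and weight decay
wA = w0.clone()
wA -= lr * (k * g + wd * wA)

# variant B: unscaled gradient, lr*k and weight_decay/k
wB = w0.clone()
wB -= (lr * k) * (g + (wd / k) * wB)

print(torch.allclose(wA, wB))   # True: the two updates coincide
```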