
about your gradient

Open BobxmuMa opened this issue 3 years ago • 2 comments

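First of all, thank you very much for open-sourcing your XNOR-pytorch code. I noticed that when you update the full-precision weights, you multiply the weight gradients by some coefficients: self.target_modules[index].grad.data = m.add(m_add).mul(1.0-1.0/s[1]).mul(n) and self.target_modules[index].grad.data = self.target_modules[index].grad.data.mul(1e+9). I could not find a description of these coefficients in the original paper. Could you explain why you transform the gradient this way?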

BobxmuMa avatar Sep 19 '21 07:09 BobxmuMa

I would like to know as well. If the original poster has figured it out, could you please explain? Thanks.

zhaoxiangshun avatar Oct 18 '21 09:10 zhaoxiangshun

Hi @BobxmuMa @zhaoxiangshun, the 1e+9 factor appears in the paper author's initial repo, so I kept it. Its main effect is to increase the range of the weights and reduce the effect of weight decay. I suppose a much smaller weight decay value would have the same effect. I also tested the accuracy with and without this factor, and in my tests the accuracy was higher with it.

jiecaoyu avatar Nov 15 '21 20:11 jiecaoyu
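
The remark that a much smaller weight decay should behave like the 1e+9 factor can be checked on a toy problem. The sketch below is not taken from the repo. It assumes an Adam-style optimizer whose L2 weight decay is added to the gradient after the gradient has been scaled, which the thread does not confirm. The function `train`, the scalar loss 0.5 * (w - 1)^2, and all hyperparameter values are hypothetical and chosen only for illustration. Under these assumptions, Adam's update is nearly invariant to the overall gradient scale, so scaling the data gradient by c with decay wd should behave almost like an unscaled gradient with decay wd / c.

```python
import math


def train(grad_scale, weight_decay, steps=1000, lr=1e-2,
          beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimise 0.5 * (w - 1)^2 with a hand-written Adam step.

    grad_scale multiplies the data gradient before the optimizer sees it
    (standing in for the .mul(1e+9) call); weight_decay is an L2 term
    added to the gradient inside the optimizer (an assumption).
    """
    w, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        data_grad = w - 1.0                       # d/dw of 0.5 * (w - 1)^2
        g = grad_scale * data_grad + weight_decay * w
        m = beta1 * m + (1.0 - beta1) * g         # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g     # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)            # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Unscaled gradient with a deliberately large decay: w is pulled towards 0.
w_plain = train(grad_scale=1.0, weight_decay=0.5)
# Same decay, data gradient scaled by 1e9: the decay term becomes negligible.
w_scaled = train(grad_scale=1e9, weight_decay=0.5)
# Unscaled gradient with the decay divided by 1e9 instead.
w_small_decay = train(grad_scale=1.0, weight_decay=0.5 / 1e9)

print("plain:", w_plain)
print("scaled gradient:", w_scaled)
print("small decay:", w_small_decay)
```

In this toy setting the scaled-gradient run and the small-decay run end at nearly the same weight, while the plain run settles at a smaller value because the decay term still matters. With plain SGD the equivalence would not hold in this form, since scaling the gradient there also scales the effective learning rate.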