Young

7 comments by Young

Thanks for the reply. Where can I find the new version of the code? It doesn't seem to have been updated here. In the code I'm looking at, line 22 of base_conv starts with `self.eta = np.zeros((shape[0], (shape[1] - ksize + 1) / self.stride, (shape[1] - ksize + 1) / self.stride,`. Shouldn't the second `shape[1]` there be `shape[2]`?
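
A minimal sketch of the shape arithmetic behind this question, assuming the layer stores activations as NHWC; the concrete numbers and the `out_channels` name are illustrative, not taken from the repo:

```python
import numpy as np

# Illustration of the shape question above, assuming an NHWC layout
# (batch, height, width, channels) and a valid convolution.
shape = (8, 32, 48, 3)                       # batch=8, H=32, W=48, C=3
ksize, stride, out_channels = 3, 1, 16

out_h = (shape[1] - ksize + 1) // stride     # height axis -> shape[1]
out_w = (shape[2] - ksize + 1) // stride     # width axis  -> shape[2]

# With a non-square input, reusing shape[1] for both spatial axes would
# silently produce a (30, 30) map instead of (30, 46). Note also the integer
# division //, since np.zeros needs integer dimensions.
eta = np.zeros((shape[0], out_h, out_w, out_channels))
print(eta.shape)                             # (8, 30, 46, 16)
```

A square-input test case would hide this bug, which is why it is easy to miss.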

According to the softmax formula, as I understand it the loss should be `self.loss += np.log(np.sum(np.exp(prediction[i]))) - prediction[i]`. Why do you use `prediction[i, label[i]]` instead?
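
For context, a short sketch of the cross-entropy derivation that explains the indexed term; the logits and labels below are made up:

```python
import numpy as np

# For one sample with logits z and true class y, softmax cross-entropy is
#   L = -log(softmax(z)[y]) = log(sum_j exp(z_j)) - z[y]
# so the subtracted term is the scalar logit of the labelled class, which is
# what prediction[i, label[i]] selects; subtracting the whole vector
# prediction[i] would not yield a scalar loss.
prediction = np.array([[2.0, 0.5, -1.0],
                       [0.1, 1.2, 0.3]])     # illustrative logits, shape (N, C)
label = np.array([0, 1])                     # illustrative true classes

loss = 0.0
for i in range(prediction.shape[0]):
    loss += np.log(np.sum(np.exp(prediction[i]))) - prediction[i, label[i]]
print(loss)                                  # summed cross-entropy over N samples
```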

According to Adam's update rule, shouldn't it be `self.data -= learning_rate_t * self.m_t / ((self.v_t) ** 0.5 + self.epsilon)`? Why does your code use `self.data -= learning_rate_t * self.m_t / (self.v_t + self.epsilon) ** 0.5`? I also couldn't find a formula for updating the learning rate in Adam. Could you give a link to a reference article?
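
For reference, a sketch of one Adam step following Algorithm 1 of Kingma & Ba, "Adam: A Method for Stochastic Optimization" (arXiv:1412.6980); the variable names mirror the question, but the function itself is illustrative, not the repo's code:

```python
import numpy as np

# One Adam step per Kingma & Ba (2014), arXiv:1412.6980. Note the paper's
# denominator is sqrt(v) + epsilon, i.e. epsilon sits outside the square
# root, and the "learning rate update" asked about is the bias correction
#   learning_rate_t = learning_rate * sqrt(1 - beta2**t) / (1 - beta1**t)
# that the paper folds into the step size for efficiency.
def adam_step(theta, grad, m, v, t,
              learning_rate=1e-3, beta1=0.9, beta2=0.999, epsilon=1e-8):
    m = beta1 * m + (1 - beta1) * grad           # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # biased second-moment estimate
    learning_rate_t = learning_rate * np.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
    theta = theta - learning_rate_t * m / (np.sqrt(v) + epsilon)
    return theta, m, v
```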

I can understand the kernel flipping now, but why does your code also need `swapaxes`? Could you explain that in detail? Thanks!
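
A sketch of the standard reason for the channel swap in the backward pass; the `(ksize, ksize, C_in, C_out)` weight layout here is an assumption, not necessarily what the repo uses:

```python
import numpy as np

# In the backward pass, the gradient w.r.t. the input is a full convolution
# of eta with the kernels rotated 180 degrees spatially. eta has C_out
# channels, so the backward kernel must map C_out back to C_in; swapping
# the two channel axes provides exactly that reversed mapping.
weights = np.random.randn(3, 3, 4, 8)                # (ksize, ksize, C_in=4, C_out=8)

flipped = np.flip(np.flip(weights, axis=0), axis=1)  # 180-degree spatial flip
backward_kernel = flipped.swapaxes(2, 3)             # exchange in/out channels
print(backward_kernel.shape)                         # (3, 3, 8, 4): C_out -> C_in
```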

`x` is the output of the Channel Attention Module, and `module_input` is set equal to `x` so that it can serve as the residual in the Spatial Attention Module.
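
A toy sketch of that pattern, assuming CBAM-style modules; the class and its submodules are stand-ins, not the repo's actual implementation:

```python
import torch.nn as nn

# module_input keeps a copy of the channel-attended features so the spatial
# branch can be applied as a residual on top of them.
class AttentionBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.channel_att = nn.Sequential(            # toy channel attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial_att = nn.Sequential(            # toy spatial attention
            nn.Conv2d(channels, 1, 7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_att(x)                  # channel attention output
        module_input = x                             # saved as the residual
        x = x * self.spatial_att(x)                  # spatial attention
        return module_input + x                      # residual connection
```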

I can provide the pretrained weights, but the model I trained did not reach the accuracy reported in the paper; it was even worse than seresnet50.

OK. I will add a comment to each chunk of the code. I am so busy these days.