DenseNetCaffe

The loss stays at 87.3365 during training and doesn't change

Open cengzy14 opened this issue 7 years ago • 5 comments

I followed the instructions and didn't change the settings in solver.prototxt, but the loss quickly settled at 87.3365 and stayed there. I've read that this happens when the learning rate is too large and the features before the softmax layer become inf. What settings should I use with this network? Thanks a lot!

cengzy14 avatar Oct 31 '17 03:10 cengzy14
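
Editor's note on where the number 87.3365 comes from: as far as I understand, Caffe's softmax loss clamps the predicted probability of the true class at FLT_MIN before taking the log, so a saturated prediction gives -log(FLT_MIN), which is about 87.3365. The sketch below reproduces that arithmetic in plain Python. The function name `clamped_softmax_loss` and the example logits are illustrative assumptions, not Caffe code.

```python
import math

# Smallest positive normal float32 (the value of C's FLT_MIN), exactly 2**-126.
FLT_MIN = 2.0 ** -126


def clamped_softmax_loss(logits, label):
    """Cross-entropy for one sample, with the true-class probability
    clamped at FLT_MIN (hypothetical helper, mimicking the clamp)."""
    m = max(logits)                                # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    prob = exps[label] / sum(exps)
    return -math.log(max(prob, FLT_MIN))


# Assumed example: the wrong class has a huge logit, so the true-class
# probability underflows to 0 and the clamp takes over.
saturated = clamped_softmax_loss([1000.0, 0.0], label=1)
print(round(saturated, 4))  # 87.3365
```

A loss that sits at exactly this value therefore suggests the true-class probability has collapsed to zero (or the logits have blown up), rather than that training has converged.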

Maybe try a lower learning rate first. If that doesn't work, check whether other network architectures fail in the same way. Then you can decide whether to keep using this network or whether there is a bug somewhere else.

liuzhuang13 avatar Oct 31 '17 12:10 liuzhuang13

@cengzy14, I have the same problem. Have you found a solution?

zhaofenqiang avatar Nov 13 '17 03:11 zhaofenqiang

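@zhaofenqiang The loss shows 87.3365 because the feature layer before the softmax contains inf or nan. If 87.3365 appears at the first test pass, the cause is that the BN layers' variance is initialized to 0 while eps is 1e-5; dividing by the square root of eps makes the values in the feature maps grow larger and larger until they become inf, so the first test pass will always show 87.3365. If it appears during training, the cause may be that a pooling layer's stride does not evenly divide the size of its input feature map, which produces nan. After fixing both problems my accuracy was 0, and I never managed to solve that, so I switched to the code and ImageNet-trained model provided at https://github.com/shicai/DenseNet-Caffe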

cengzy14 avatar Nov 13 '17 04:11 cengzy14
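
Editor's note on the BN point above: with a stored variance of 0 and eps = 1e-5, each BN layer divides by sqrt(1e-5), which multiplies its input by roughly 316. The rough sketch below counts how many such layers in a row would push a value of 1.0 past the float32 range. It deliberately ignores convolution weights, ReLU, and the BN mean, so it only illustrates the growth rate; the function names are hypothetical.

```python
import math

EPS = 1e-5                    # BN epsilon quoted in the thread
FLOAT32_MAX = 3.4028235e38    # largest finite float32 value


def bn_scale(variance, eps=EPS):
    """Multiplicative factor applied by BN normalization: 1 / sqrt(var + eps)."""
    return 1.0 / math.sqrt(variance + eps)


def layers_until_overflow(start=1.0, variance=0.0):
    """Count stacked BN layers needed before `start` exceeds float32 range
    (simplified model: nothing but the BN scaling is applied)."""
    value, layers = start, 0
    while value <= FLOAT32_MAX:
        value *= bn_scale(variance)
        layers += 1
    return layers


print(round(bn_scale(0.0), 1))    # 316.2
print(layers_until_overflow())    # 16
```

Under these simplifying assumptions the activations leave the float32 range after 16 BN layers, far fewer than a DenseNet contains, which is consistent with the inf described above.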

@cengzy14 Thanks, that's very helpful~

zhaofenqiang avatar Nov 15 '17 08:11 zhaofenqiang

@cengzy14 May I ask whether the network converged quickly when you fine-tuned it?

wjzh1 avatar Apr 04 '18 06:04 wjzh1