MTCNN-Tensorflow
Gradient explosion problem
I am training 95-point landmarks on my own data, and the gradients explode while training P-Net. How can I fix this?
2018-08-19 11:53:56.928519 : Step: 10, accuracy: 0.963731, cls loss: 0.328417, bbox loss: 0.195085, landmark loss: nan,L2 loss: 0.026220,lr:0.010000
2018-08-19 11:53:57.168789 : Step: 20, accuracy: 0.984848, cls loss: 0.157212, bbox loss: 0.123694, landmark loss: nan,L2 loss: 0.026372,lr:0.010000
2018-08-19 11:53:57.416696 : Step: 30, accuracy: 0.981043, cls loss: 0.140952, bbox loss: 0.107430, landmark loss: 11.471462,L2 loss: 0.026815,lr:0.010000
2018-08-19 11:53:57.644162 : Step: 40, accuracy: 0.976415, cls loss: 0.168928, bbox loss: 0.139431, landmark loss: nan,L2 loss: 0.027715,lr:0.010000
2018-08-19 11:53:57.864533 : Step: 50, accuracy: 0.967213, cls loss: 0.257370, bbox loss: 0.876604, landmark loss: 24.074501,L2 loss: 0.032388,lr:0.010000
2018-08-19 11:53:58.072143 : Step: 60, accuracy: 0.014634, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: inf,lr:0.010000
2018-08-19 11:53:58.293608 : Step: 70, accuracy: 0.050000, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:53:58.524199 : Step: 80, accuracy: 0.054054, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:53:58.761190 : Step: 90, accuracy: 0.021978, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:53:58.971244 : Step: 100, accuracy: 0.059113, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:53:59.180251 : Step: 110, accuracy: 0.060914, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:53:59.388746 : Step: 120, accuracy: 0.054945, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:53:59.593905 : Step: 130, accuracy: 0.019608, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:53:59.798467 : Step: 140, accuracy: 0.047120, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:54:00.001104 : Step: 150, accuracy: 0.049020, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:54:00.210982 : Step: 160, accuracy: 0.018868, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:54:00.417550 : Step: 170, accuracy: 0.023923, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:54:00.627177 : Step: 180, accuracy: 0.026882, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:54:00.847200 : Step: 190, accuracy: 0.021622, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
2018-08-19 11:54:01.064762 : Step: 200, accuracy: 0.043956, cls loss: nan, bbox loss: nan, landmark loss: nan,L2 loss: nan,lr:0.010000
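In the log above, the landmark loss is already nan at step 10, while the other losses stay finite until step 60. That may point to the landmark branch or its labels as the place where the trouble starts. Below is a minimal sketch for locating the first non-finite value of each loss, assuming the log format shown here; `first_non_finite` is a hypothetical helper for illustration and is not part of the repository.

```python
import math
import re

# Matches "<name> loss: <value>" pairs as they appear in the log above.
LOSS_RE = re.compile(r"(cls|bbox|landmark|L2) loss:\s*([^,\s]+)")


def first_non_finite(log_text):
    """Return {loss_name: first step at which that loss is nan or inf}."""
    first_bad = {}
    # Every record contains exactly one "Step:", so split the text there.
    for record in log_text.split("Step:")[1:]:
        step = int(record.split(",")[0])
        for name, value in LOSS_RE.findall(record):
            # float() accepts the strings "nan" and "inf".
            if not math.isfinite(float(value)) and name not in first_bad:
                first_bad[name] = step
    return first_bad


# Three records copied from the log above.
sample = (
    "2018-08-19 11:53:56.928519 : Step: 10, accuracy: 0.963731, "
    "cls loss: 0.328417, bbox loss: 0.195085, landmark loss: nan,"
    "L2 loss: 0.026220,lr:0.010000\n"
    "2018-08-19 11:53:57.864533 : Step: 50, accuracy: 0.967213, "
    "cls loss: 0.257370, bbox loss: 0.876604, landmark loss: 24.074501,"
    "L2 loss: 0.032388,lr:0.010000\n"
    "2018-08-19 11:53:58.072143 : Step: 60, accuracy: 0.014634, "
    "cls loss: nan, bbox loss: nan, landmark loss: nan,"
    "L2 loss: inf,lr:0.010000\n"
)

result = first_non_finite(sample)
print(result)
```

For these three records the landmark loss is flagged at step 10 and the cls, bbox and L2 losses at step 60.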
Try reducing the learning rate.
The loss later grows exponentially, and reducing the learning rate did not help.
@AITTSMD could you please give some advice?
@SunshineJZJ how did you solve the gradient explosion that appears when training with additional landmark points?
My problem is that the landmark loss keeps hovering around 15, and I don't know why. Can anyone explain this? I'm not sure whom to @.
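I'm not sure whether you used PReLU as the activation function in the earlier layers. Also, the final output layer should not have an activation function; it should do plain linear regression. The network is very shallow, so gradient explosion like this normally shouldn't happen. The likely cause is that the intermediate activation functions are not set up properly, or that the last layer has an activation function.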
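To make the two suggestions concrete, here is a minimal plain-Python sketch of a PReLU activation followed by an output layer with no activation. The names `prelu` and `linear_output`, the slope `alpha=0.25`, and all weights are illustrative assumptions, not values taken from the repository; in a real network the slope is a learned parameter.

```python
def prelu(x, alpha):
    """PReLU: identity for positive inputs, slope `alpha` for negative ones."""
    return x if x > 0 else alpha * x


def linear_output(features, weights, bias):
    """Output layer with no activation: a weighted sum plus a bias."""
    return sum(f * w for f, w in zip(features, weights)) + bias


# Hidden activations pass through PReLU (alpha is an assumed fixed value here).
hidden = [prelu(v, alpha=0.25) for v in [-2.0, 0.5, 3.0]]

# The regression output is left linear, so it can take any real value.
pred = linear_output(hidden, weights=[1.0, -1.0, 0.5], bias=-1.0)

# For contrast: a ReLU on the output would clip this negative prediction to 0.
clipped = max(0.0, pred)

print(hidden, pred, clipped)
```

Here the linear output is negative, and the ReLU-clipped version collapses to zero. That shows how an activation on the last layer can prevent a regression head from reaching some target values.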