
CNN backpropagation deltas - not sure if calculated correctly

Open · dinosg opened this issue 10 years ago · 1 comment

Regarding the backpropagation algorithm for the CNN methods: as stated in the code for cnnbp, the delta in the output layer is defined as net.od = net.e .* (net.o .* (1 - net.o));

However, it is not obvious why the relation shouldn't simply be net.od = net.e. At the output layer, no derivative of the sigmoid function needs to be chained into the back-propagated error. Furthermore, if the output net.o is stuck in the wrong state, for example the target value = 1 while net.o = 0, or vice versa, back-propagating the error via net.od = net.e .* (net.o .* (1 - net.o)) always yields net.od = 0, so no corrections can be back-propagated.
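For reference, whether the sigmoid derivative belongs in the output delta depends on the loss being minimized: with the toolbox's squared-error loss it does, and it only cancels if the loss is changed to cross-entropy. A minimal standalone sketch of the two pairings (variable names mirror cnnbp; the scalar values are made up for illustration):

```matlab
% Hypothetical standalone comparison (names mirror cnnbp, values are made up).
sigm = @(z) 1 ./ (1 + exp(-z));
y = 1;  z = -2;                  % target and output-layer pre-activation
o = sigm(z);  e = o - y;         % net.o and net.e in the toolbox

% Squared-error loss L = 1/2 * e^2: the chain rule keeps the sigmoid derivative.
od_mse  = e .* (o .* (1 - o));   % the toolbox formula

% Cross-entropy loss L = -(y*log(o) + (1-y)*log(1-o)):
% dL/do = (o - y) / (o * (1 - o)), so the sigmoid derivative cancels.
od_xent = e;                     % the formula proposed in this issue
```

On this reading, net.od = net.e is the correct output delta for a cross-entropy loss, not for the squared error that cnnbp's delta is paired with.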

By using the relation net.od = net.e I was able to reduce the MSE in the test_example_CNN demo by a third in the first epoch, and to obtain convergence on harder examples where the prediction got stuck in the wrong state.

dinosg · Jun 20 '14 18:06

Hmm, I'm pretty sure it's correct. The sigm function is applied at the output layer, so its derivative has to be chained into the output delta.

Does cnnnumgradcheck work if you change it?
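For context, a numerical gradient check compares the analytic delta against central-difference estimates of the actual training loss, so swapping in net.od = net.e should fail it unless the loss is changed as well. A generic illustration of the idea (a hypothetical standalone check, not the toolbox's cnnnumgradcheck internals):

```matlab
% Central-difference check on a single sigmoid output unit (illustrative only).
sigm = @(z) 1 ./ (1 + exp(-z));
y = 1;  z = 0.3;                       % target and pre-activation
loss = @(z) 0.5 * (sigm(z) - y)^2;     % squared-error loss, as in cnnbp

o = sigm(z);
analytic = (o - y) * o * (1 - o);      % net.e .* (net.o .* (1 - net.o))
proposed = (o - y);                    % the net.od = net.e variant

eps_ = 1e-4;
numeric = (loss(z + eps_) - loss(z - eps_)) / (2 * eps_);
fprintf('numeric %.6g  analytic %.6g  proposed %.6g\n', numeric, analytic, proposed);
% numeric matches analytic, not proposed, for the squared-error loss.
```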

rasmusbergpalm · Jul 10 '14 20:07