DeepLearnToolbox
CNN backpropagation deltas - not sure if calculated correctly
Regarding the backpropagation algorithm for the CNN methods: in the code for `cnnbp`, the delta at the output layer is defined as `net.od = net.e .* (net.o .* (1 - net.o));`.
However, it is not obvious why the relation shouldn't simply be `net.od = net.e`. At the output layer, neither the derivative of the sigmoid function nor the chain rule needs to be back-propagated. Furthermore, if the output `net.o` is stuck in the wrong state, for example the target feature value is 1 while `net.o = 0`, or vice versa, then back-propagating the error with `net.od = net.e .* (net.o .* (1 - net.o))` always results in `net.od = 0`, so no corrections can be back-propagated.
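
To make the saturation problem concrete, here is a minimal sketch in plain MATLAB (the scalars `t`, `o` and `e` are hypothetical stand-ins for the target, `net.o` and `net.e`, assuming `net.e = net.o - net.y` as in `cnnbp`):

```matlab
t = 1;       % desired feature value
o = 1e-6;    % sigmoid output stuck near the wrong extreme
e = o - t;   % output error, as in net.e = net.o - net.y

od_sigmoid = e .* (o .* (1 - o));  % toolbox rule: about -1e-6, vanishes
od_plain   = e;                    % proposed rule: about -1, full correction
```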
By using the relation `net.od = net.e` I was able to reduce the MSE in the test_example_CNN demo by one third in the first epoch, and to obtain convergence on more difficult examples where the prediction got stuck in the wrong state.
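
For reference, the two rules correspond to two different loss functions. With a sigmoid output o = sigm(z) and target t, the squared-error loss produces the o(1 - o) factor via the chain rule, while under cross-entropy that factor cancels exactly:

```latex
L_{\text{sq}} = \tfrac{1}{2}(o - t)^2
  \;\Rightarrow\;
  \frac{\partial L_{\text{sq}}}{\partial z} = (o - t)\,o\,(1 - o)

L_{\text{ce}} = -\bigl(t \log o + (1 - t)\log(1 - o)\bigr)
  \;\Rightarrow\;
  \frac{\partial L_{\text{ce}}}{\partial z} = o - t
```

So `net.od = net.e` is the exact gradient if the output layer is read as minimizing cross-entropy rather than squared error, which would also explain why it escapes the saturated states described above.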
Hmm, I'm pretty sure it's correct. The `sigm` function is applied at the output layer.
Does `cnnnumgradcheck` work if you change it?
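
For anyone unfamiliar with it, the idea behind a numerical gradient check is to compare the analytic gradient against a finite-difference slope of the loss. A minimal generic sketch of the idea, not the toolbox's `cnnnumgradcheck` itself (`lossfun`, `w` and `analytic_grad` are hypothetical names):

```matlab
function checkgrad(lossfun, w, analytic_grad)
% Compare an analytic gradient against a central finite difference.
% lossfun: handle mapping a scalar weight to the scalar loss.
    epsilon  = 1e-4;
    num_grad = (lossfun(w + epsilon) - lossfun(w - epsilon)) / (2 * epsilon);
    fprintf('numeric %g vs analytic %g (abs diff %g)\n', ...
            num_grad, analytic_grad, abs(num_grad - analytic_grad));
end
```

If the check fails after changing the output delta, that would be consistent with the change implicitly switching the loss function away from the squared error that the numerical check differentiates.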