NeuralNetworkFromScratch
Fixed formally incorrect backpropagation/gradient descent
The gradient calculation was missing the derivative of the sigmoid function for the output layer, and the hidden-layer weights were updated before the input-layer gradient was computed, so that gradient was based on already-modified weights. Neither is formally correct. The latter probably still works in many cases with small enough step sizes; the former probably only works because the sigmoid function is strictly increasing, so the sign of the gradient is preserved.
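For reference, here is a minimal sketch of the corrected update, written in Python/NumPy for brevity rather than in the library's own code; all names, shapes, loss choice, and the learning rate are illustrative assumptions, not taken from the repository. It shows the two points of the fix: the output-layer delta includes the sigmoid derivative, and both gradients are computed from the old weights before either weight matrix is modified.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# Illustrative 2-3-1 network; shapes and names are made up for this sketch.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))   # input -> hidden weights
W2 = rng.normal(size=(1, 3))   # hidden -> output weights
lr = 0.5

def train_step(x, y):
    global W1, W2
    # Forward pass, keeping pre-activations for the backward pass.
    z1 = W1 @ x
    a1 = sigmoid(z1)
    z2 = W2 @ a1
    a2 = sigmoid(z2)

    # Backward pass for squared error L = 0.5 * (a2 - y)^2.
    # The output-layer delta must include sigmoid'(z2); dropping that
    # factor (the first bug) only preserves the gradient's sign because
    # sigmoid is strictly increasing.
    delta2 = (a2 - y) * sigmoid_prime(z2)
    # delta1 uses W2, so W2 must still hold the OLD values here. Updating
    # W2 before this line (the second bug) would mix old and new weights.
    delta1 = (W2.T @ delta2) * sigmoid_prime(z1)

    # Compute both gradients first, then apply both updates.
    grad_W2 = np.outer(delta2, a1)
    grad_W1 = np.outer(delta1, x)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

# Toy usage: fit a single input/target pair.
x, y = np.array([0.5, -1.0]), np.array([1.0])
for _ in range(1000):
    train_step(x, y)
print(sigmoid(W2 @ sigmoid(W1 @ x)))  # should be close to 1.0
```

Computing `grad_W1` and `grad_W2` before touching either weight matrix is what keeps the update a true gradient-descent step on the old parameters, which is exactly the ordering issue the fix addresses.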