neural-network-from-scratch
Implementing Multiple Layer Neural Network from Scratch
In the backpropagation part, the first line of code reads: dtanh = softmaxOutput.diff(forward[len(forward)-1][2], y). So the last layer's output is passed through the activation and then sent to the softmax? I guess for the last layer there is no...
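If I read it right, the structure is roughly like this. This is only a sketch with my own helper names, assuming forward[i] stores (mul, add, activated) for layer i and that softmaxOutput.diff returns the cross-entropy gradient w.r.t. whatever is fed into the softmax:

    import numpy as np

    def softmax_diff(scores, y):
        # gradient of softmax + cross-entropy loss w.r.t. its input "scores"
        probs = np.exp(scores - scores.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        probs[np.arange(len(y)), y] -= 1.0
        return probs

    def forward_pass(X, W, b):
        # forward[i] = (mul, add, activated) for layer i; forward[0] holds the input
        forward = [(None, None, X)]
        inp = X
        for Wi, bi in zip(W, b):
            mul = inp.dot(Wi)      # MultiplyGate.forward
            add = mul + bi         # AddGate.forward
            inp = np.tanh(add)     # Tanh.forward
            forward.append((mul, add, inp))
        return forward

    # The first backprop line then differentiates the loss w.r.t. the *activated*
    # output of the last layer, i.e. exactly what was fed into the softmax:
    # dtanh = softmax_diff(forward[len(forward)-1][2], y)

So forward[len(forward)-1][2] would be the tanh output of the last layer, and the softmax sits on top of that activated output.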
I think this is right:

    def backward(self, X, top_diff):
        output = self.forward(X)
        return (1.0 - np.square(output)) * top_diff

because (tanh x)' = sech²x = 1 - tanh²x.
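A quick numerical gradient check makes the sign easy to verify. This is just a sketch with a standalone Tanh class using the forward/backward above:

    import numpy as np

    class Tanh:
        def forward(self, X):
            return np.tanh(X)

        def backward(self, X, top_diff):
            output = self.forward(X)
            return (1.0 - np.square(output)) * top_diff

    layer = Tanh()
    X = np.random.randn(3, 4)
    eps = 1e-6
    analytic = layer.backward(X, np.ones_like(X))
    # central-difference estimate of d tanh(x) / dx
    numeric = (layer.forward(X + eps) - layer.forward(X - eps)) / (2 * eps)
    print(np.allclose(analytic, numeric))  # True with 1 - tanh^2, False with 1 + tanh^2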
Very confusing. I have searched a lot about the BP algorithm; some notes say it is enough to differentiate only w.r.t. W (the parameters) and use the residual (error term) to get the gradient? Your example seems...
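If it helps, the "residual" view and the "differentiate w.r.t. W" view are usually the same calculation: the residual (delta) is dL/d(pre-activation), the gradient w.r.t. W at each layer is input^T · delta, and the delta itself is what gets passed backward. A rough sketch of one layer's backward step, with my own variable names rather than the article's:

    import numpy as np

    def layer_backward(W, b, layer_input, layer_preact, delta_out):
        # delta_out: dL/d(activation output) of this layer
        delta = delta_out * (1.0 - np.tanh(layer_preact) ** 2)  # back through tanh
        dW = layer_input.T.dot(delta)   # gradient w.r.t. W, built from the residual
        db = delta.sum(axis=0)          # gradient w.r.t. b
        delta_in = delta.dot(W.T)       # residual handed to the previous layer
        return dW, db, delta_in

So differentiating w.r.t. W and "using the residual" are not two different algorithms; the residual is just the intermediate quantity that both the parameter gradients and the next backward step are computed from.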