nn-from-scratch
gradient checks do not match
I checked the gradients you derived against the numerical gradients, and your implementation does not match. It looks like the error is in two places:
- In `calculate_loss`, you average the total loss (including the regularization term) over the data batch. The correct implementation should average only the log loss, not the regularization term.
- In `build_model`, the gradients (`dW1`, `dW2`, `db1`, `db2`) during backprop should be averaged over the data batch. Again, the correct implementation should not include the regularization terms in the average over the data batch.
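To make the first point concrete, here is a minimal sketch of a corrected `calculate_loss`, assuming the two-layer tanh/softmax architecture from the tutorial; the names `W1`, `b1`, `W2`, `b2`, and `reg_lambda` are assumptions based on that code, not a copy of this repository's implementation:

```python
import numpy as np

def calculate_loss(model, X, y, reg_lambda=0.01):
    """Cross-entropy loss with L2 regularization.

    Sketch assuming a 2-layer tanh/softmax network with parameters
    W1, b1, W2, b2 (parameter names are assumed from the tutorial).
    """
    W1, b1, W2, b2 = model["W1"], model["b1"], model["W2"], model["b2"]
    num_examples = X.shape[0]

    # Forward pass
    z1 = X.dot(W1) + b1
    a1 = np.tanh(z1)
    z2 = a1.dot(W2) + b2
    exp_scores = np.exp(z2 - z2.max(axis=1, keepdims=True))  # stabilized softmax
    probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)

    # Average ONLY the log loss over the data batch...
    data_loss = -np.mean(np.log(probs[range(num_examples), y]))
    # ...and add the regularization term WITHOUT dividing by num_examples.
    reg_loss = reg_lambda / 2 * (np.sum(np.square(W1)) + np.sum(np.square(W2)))
    return data_loss + reg_loss
```

Note that with this split, duplicating the batch leaves the loss unchanged, which is exactly what averaging only the log-loss term buys you.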
Do you have, or know of, a better implementation? Can you explain or show me how you checked it?
@uripeled2 I have a method for gradient checking in my implementation here: https://github.com/vuptran/introduction-to-neural-networks
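For reference, the standard way such a check works is central-difference numerical differentiation compared against the analytic gradients via a relative error. This is a generic sketch of that technique, not the implementation from the linked repository:

```python
import numpy as np

def numerical_gradient(f, theta, eps=1e-5):
    """Central-difference gradient of a scalar function f at array theta.

    f is a zero-argument callable that reads theta; each entry of theta
    is perturbed in place by +/- eps and then restored.
    """
    grad = np.zeros_like(theta)
    it = np.nditer(theta, flags=["multi_index"])
    while not it.finished:
        idx = it.multi_index
        orig = theta[idx]
        theta[idx] = orig + eps
        f_plus = f()
        theta[idx] = orig - eps
        f_minus = f()
        theta[idx] = orig  # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * eps)
        it.iternext()
    return grad

def relative_error(analytic, numeric):
    """Standard gradient-check metric; values below ~1e-6 indicate a match."""
    num = np.abs(analytic - numeric)
    den = np.maximum(np.abs(analytic) + np.abs(numeric), 1e-12)
    return np.max(num / den)
```

Running this against each of `dW1`, `dW2`, `db1`, `db2` (with `f` set to the loss on a fixed batch) is what reveals the averaging mismatch described above.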