
Assignment 1 q2_neural.py softmax gradient not explicitly calculated

rshamsy opened this issue Feb 13 '19 · 1 comment

When computing the gradients, the gradient of the softmax function is not calculated using the formula derived in the lecture notes. It seems that the code skips this step and uses only the gradient of the cost function with respect to yhat (the `d3` variable). Am I missing something here?

rshamsy · Feb 13 '19

I found nothing wrong with the backprop code. Cross-entropy loss with a softmax output has a very simple combined derivative: differentiating J = -sum_i y_i * log(yhat_i), where yhat = softmax(z), with respect to the logits z gives dJ/dz = yhat - labels when the labels are one-hot. So the full softmax Jacobian never needs to be formed explicitly, and `d3` already is the gradient with respect to the pre-softmax scores. You can find more details at https://deepnotes.io/softmax-crossentropy
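
To see why skipping the explicit Jacobian is safe, here is a minimal NumPy sketch (independent of the assignment code; `softmax` and `cross_entropy` below are toy re-implementations, not the assignment's functions) that compares the shortcut gradient against a numerical gradient:

```python
import numpy as np

np.random.seed(0)

def softmax(z):
    # Numerically stable softmax for a 1-D vector of logits
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(z, y):
    # Cross-entropy loss J = -sum_i y_i * log(softmax(z)_i)
    return -np.sum(y * np.log(softmax(z)))

logits = np.random.randn(5)
labels = np.zeros(5)
labels[2] = 1.0  # one-hot target

# Analytic gradient from the combined formula: dJ/dz = yhat - labels
analytic = softmax(logits) - labels

# Numerical gradient via central differences
eps = 1e-6
numeric = np.zeros_like(logits)
for i in range(len(logits)):
    zp, zm = logits.copy(), logits.copy()
    zp[i] += eps
    zm[i] -= eps
    numeric[i] = (cross_entropy(zp, labels) - cross_entropy(zm, labels)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # ~1e-10: the shortcut matches
```

The two gradients agree to numerical precision, which is exactly why the code can use `yhat - labels` directly instead of multiplying the loss gradient by the softmax Jacobian.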

Hope this helps.

Spico197 · Jul 25 '19