Integrated-Gradients
How would you justify negative attributions?
Hi Ankur,
Thank you for the excellent work on Integrated Gradients! It provides a great guideline for exploring what a neural network is doing. Can I ask whether there is any justification for negative attributions? Or should we just interpret them as smaller attributions? Seeing negative attributions in LSTMs is not that intuitive.
E.g., given Attribution_1 = -1 and Attribution_2 = 1, can we naively conclude that Attribution_2 has more impact on the final result?
Best, Yijun
Negative attribution usually means that removing that pixel (that is, replacing it with its baseline value) would increase the probability of that class, while positive attribution means that removing that pixel would decrease the probability of that class. Does that help?
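To make the sign concrete, here is a minimal NumPy sketch on a toy logistic model. It is not code from this repository: the weights, the input, the all-zeros baseline, and the helper names (`model`, `model_grad`, `integrated_gradients`) are all made up for illustration, and the path integral is approximated with a midpoint Riemann sum.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w, b):
    # Toy "network": probability of the explained class.
    return sigmoid(np.dot(x, w) + b)

def model_grad(x, w, b):
    # Analytic gradient of the sigmoid output with respect to the input.
    p = model(x, w, b)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, b, steps=200):
    # Midpoint Riemann sum along the straight line from baseline to x.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([model_grad(baseline + a * (x - baseline), w, b)
                      for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical weights: feature 1 pushes the class probability down.
w = np.array([2.0, -3.0, 0.5])
b = 0.0
x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros_like(x)

attr = integrated_gradients(x, baseline, w, b)
print("attributions:", attr)
print("sum of attributions:", attr.sum())
print("f(x) - f(baseline): ", model(x, w, b) - model(baseline, w, b))

# "Removing" feature 1 means resetting it to its baseline value.
x_ablated = x.copy()
x_ablated[1] = baseline[1]
print("f(x):", model(x, w, b), " f(x without feature 1):", model(x_ablated, w, b))
```

In this toy setup feature 1 gets a negative attribution, and resetting it to the baseline raises the output, which matches the reading above. Its magnitude is also larger than that of feature 0, so a negative attribution is not a "smaller" one: the sign gives the direction of the effect and the absolute value gives its size. The attributions also sum to approximately f(x) - f(baseline), so negative values are needed whenever some features pull the output down.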
I am wondering why there are often large positive and large negative pixels next to each other, maybe someone has a thought on that?
Hi guys, do you know what "that class" is by default? For example, in a binary classification problem with class names 0 and 1, how should I interpret a negative attribution?
By "that class" I mean the class that is currently being explained, for which I usually use the predicted class of the example.
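For the binary case, a small sketch may help. It assumes a two-class softmax output (so the two probabilities sum to 1), and the weight matrix, input, and helper names (`class_prob_grad`, `ig_for_class`) are hypothetical, not taken from this repository. Under that assumption, the attributions for one class are the exact negation of the attributions for the other, so a negative attribution for the explained class is evidence for the other class.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def class_prob_grad(x, W, b, target):
    # Gradient of softmax probability p[target] with respect to the input x.
    p = softmax(W @ x + b)
    return p[target] * (W[target] - p @ W)

def ig_for_class(x, baseline, W, b, target, steps=200):
    # Midpoint Riemann sum along the straight line from baseline to x.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([class_prob_grad(baseline + a * (x - baseline), W, b, target)
                      for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical two-class linear model: one row of W per class.
W = np.array([[1.0, -2.0, 0.5],
              [-1.0, 1.0, 0.0]])
b = np.zeros(2)
x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)

probs = softmax(W @ x + b)
predicted = int(np.argmax(probs))  # the class being explained

attr_pred = ig_for_class(x, baseline, W, b, predicted)
attr_other = ig_for_class(x, baseline, W, b, 1 - predicted)

print("explained (predicted) class:", predicted)
print("attributions for predicted class:", attr_pred)
print("attributions for the other class:", attr_other)
```

If the model instead has a single sigmoid output, the same reasoning applies with the explained class being whichever output you differentiate.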