Improve Explanations: Better explanation visualization, more Explanation methods
Improved Explanation Visualization
Currently, we use a simple color density approach to visualize importance. Early feedback suggests this helps the user immediately see the most important words/tokens, but it does not show numeric importance values or offer further interaction (e.g. top n). Can we make this better or provide better alternatives?
- [ ] Convert explanations to a single modal view: a switcher between visualization types
- [ ] Bar + density visualization for easier comparisons, similar to what was done here?
- [ ] Top n important words: show highlights only for the top n most important words? (See the sketch below.)
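A minimal sketch of what the top-n filter could look like, assuming per-token importance scores are already available as a list of floats; `top_n_highlights`, `tokens`, and `scores` are hypothetical names, not existing code:

```python
import numpy as np

def top_n_highlights(tokens, scores, n=5):
    """Return (token, score) pairs with scores outside the top n zeroed out."""
    scores = np.asarray(scores, dtype=float)
    top_idx = set(np.argsort(scores)[-n:].tolist())   # indices of the n largest scores
    return [(tok, s if i in top_idx else 0.0)
            for i, (tok, s) in enumerate(zip(tokens, scores))]

# Example: with n=2, only "capital" and "france" keep a nonzero highlight.
pairs = top_n_highlights(
    ["what", "is", "the", "capital", "of", "france"],
    [0.02, 0.01, 0.03, 0.41, 0.05, 0.48],
    n=2,
)
```

The zeroed-out scores let the existing color density renderer be reused unchanged: tokens outside the top n simply get no highlight.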
More Explanation Methods
Currently, explanations are based on vanilla gradients. We might want to explore:
- [ ] Integrated Gradients (see the sketch after this list)
- [ ] SmoothGrad
- [ ] Grad-CAM (maybe?)
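A rough sketch of Integrated Gradients over input embeddings, loosely following the TensorFlow tutorial linked below. It assumes a callable (`model_on_embeddings`) that maps a batch of embedding tensors to one scalar score per example (e.g. the start-token logit); that callable, the zero baseline, and the step count are all assumptions, not the current implementation:

```python
import tensorflow as tf

def integrated_gradients(model_on_embeddings, embeddings, baseline=None, steps=32):
    """Attribute a scalar model score to each token embedding.

    embeddings: tensor of shape (seq_len, emb_dim) for a single example.
    Returns one attribution score per token, shape (seq_len,).
    """
    if baseline is None:
        baseline = tf.zeros_like(embeddings)          # all-zero embedding baseline
    # Interpolate between the baseline and the actual embeddings.
    alphas = tf.linspace(0.0, 1.0, steps + 1)[:, tf.newaxis, tf.newaxis]
    interpolated = baseline + alphas * (embeddings - baseline)
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        scores = model_on_embeddings(interpolated)    # shape: (steps + 1,)
    grads = tape.gradient(scores, interpolated)
    # Trapezoidal approximation of the path integral of the gradients.
    avg_grads = tf.reduce_mean((grads[:-1] + grads[1:]) / 2.0, axis=0)
    ig = (embeddings - baseline) * avg_grads          # attribution per embedding dim
    return tf.reduce_sum(ig, axis=-1)                 # collapse to one score per token
```

SmoothGrad could reuse the same structure: instead of interpolating toward a baseline, average vanilla gradients over several noisy copies of the embeddings.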
Useful Resources
- https://www.tensorflow.org/tutorials/interpretability/integrated_gradients
- See https://github.com/experiencor/deep-viz-keras
- https://keras.io/examples/vision/grad_cam/
- https://blog.fastforwardlabs.com/2020/06/22/how-to-explain-huggingface-bert-for-question-answering-nlp-models-with-tf-2.0.html
- [ ] Add this as a config.yaml option, and a front-end toggle too? (See the config sketch below.)
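A possible shape for that config entry and how the backend might read it; here "this" is assumed to mean the choice of explanation method, and the `explanation` key, field names, and defaults are made up for illustration:

```python
import yaml

# Assumed defaults; "gradient" stands in for the current vanilla-gradient method.
DEFAULT_EXPLANATION = {"method": "gradient", "top_n": 5}

def load_explanation_config(path="config.yaml"):
    """Merge the (hypothetical) `explanation:` section of config.yaml over defaults."""
    with open(path) as f:
        config = yaml.safe_load(f) or {}
    return {**DEFAULT_EXPLANATION, **(config.get("explanation") or {})}

# Example config.yaml snippet the front-end toggle would read:
# explanation:
#   method: integratedgradients   # gradient | integratedgradients | smoothgrad
#   top_n: 10
```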