PyTorch implementation of various neural network interpretability methods

Interpretability of Neural Networks

Project overview

PyTorch implementation of various neural network interpretability methods, together with a study of how they can interpret uncertainty-aware models.

The main implementation can be found in the nn_interpretability package. Each method is accompanied by a Jupyter Notebook that demonstrates how to use the nn_interpretability package in practice. Some of the methods are showcased together in one notebook for easier comparison. Alongside the interpretability functionality, we provide a repository of the models we trained and additional functionality for loading and visualizing data and the results of the interpretability methods. Furthermore, we have implemented uncertainty techniques to observe the behavior of interpretability methods in stochastic settings.

The main deliverable of this repository is the package nn_interpretability, which contains every NN interpretability method that we implemented as part of the course. It can be installed and used as a library in any project. To install it, clone this repository and execute the following command from the repository root:

pip install -e .

After that, the package can be used anywhere by importing it:

import nn_interpretability as nni

Example usage of each interpretability method can be found in the corresponding Jupyter Notebook, as outlined below. We also prepared a general demonstration of the package in a separate Jupyter Notebook.

Figure panels: Image classified as tandem bicycle by pretrained VGG16 | LRP Composite | Guided Backpropagation | DeepLIFT RevealCancel

Note: The package assumes that the layers of the model are constructed inside containers (e.g. features and classifier). This convention follows the structure of the pretrained models from the model zoo. You can use torch.nn.Sequential or torch.nn.ModuleList to achieve this in your own model.
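
A minimal sketch of a model laid out this way, assuming the container names features and classifier mentioned in the note. The class name SmallConvNet and the layer sizes are hypothetical and are not part of the package.

```python
import torch
import torch.nn as nn


class SmallConvNet(nn.Module):
    """Hypothetical model whose layers live inside two containers."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Convolutional part, grouped in a container named `features`
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Fully connected part, grouped in a container named `classifier`
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(8 * 14 * 14, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


model = SmallConvNet()
logits = model(torch.randn(2, 1, 28, 28))  # batch of two 28x28 grayscale images
```

Grouping layers like this mirrors the layout of the pretrained torchvision models, so the same traversal logic can be applied to both.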

Interpretability methods

1. Model-based approaches

  • Activation Maximization
    • General Activation Maximization [1]
    • Activation Maximization in Codespace (GAN) [1]
    • Activation Maximization in Codespace (DCGAN) [1]
  • DeepDream [2][13]
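
As a rough illustration of the idea behind general activation maximization, the sketch below runs gradient ascent on an input to increase one class logit, with a small L2 penalty on the input. It uses plain PyTorch rather than the nn_interpretability API; the function name activation_maximization, its parameters, and the toy model are assumptions made for this example.

```python
import torch
import torch.nn as nn


def activation_maximization(model, input_shape, target, steps=50, lr=0.1, l2=1e-3):
    """Gradient ascent on the input to increase the logit of class `target`."""
    model.eval()
    x = torch.zeros(1, *input_shape, requires_grad=True)
    optimizer = torch.optim.SGD([x], lr=lr)  # only the input is optimized
    for _ in range(steps):
        optimizer.zero_grad()
        # Minimize the negative logit plus an L2 penalty on the input
        loss = -model(x)[0, target] + l2 * x.pow(2).sum()
        loss.backward()
        optimizer.step()
    return x.detach()


# Hypothetical toy classifier, used only to make the sketch runnable
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(1 * 8 * 8, 5))
prototype = activation_maximization(toy_model, (1, 8, 8), target=3)
```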

2. Decision-based approaches

  • Saliency Map [4]
  • DeConvNet
    • Full Input Reconstruction [5]
    • Partial Input Reconstruction [5]
  • Occlusion Sensitivity [5]
  • Backpropagation
    • Vanilla Backpropagation [4]
    • Guided Backpropagation [6]
    • Integrated Gradients [7]
    • SmoothGrad [9]
  • Taylor Decomposition
    • Simple Taylor Decomposition [1]
    • Deep Taylor Decomposition [1][8]
  • LRP
    • LRP-0 [8]
    • LRP-epsilon [8]
    • LRP-gamma [8]
    • LRP-ab [1][8]
  • DeepLIFT
    • DeepLIFT Rescale [12]
    • DeepLIFT Linear [12]
    • DeepLIFT RevealCancel [12]
  • CAM
    • Class Activation Map (CAM) [10]
    • Gradient-Weighted Class Activation Map (Grad-CAM) [11]

3. Uncertainty

  • Monte Carlo Dropout
    • Monte Carlo Dropout Analysis [14]
    • Uncertainty interpretability with LRP
  • Evidential Deep Learning
    • Evidential Deep Learning Analysis [15]
    • Base Model vs. Evidential Deep Learning Model with LRP
  • Uncertain DeepLIFT
    • DeepLIFT Deterministic vs. Stochastic Model
    • DeepLIFT Random Noise
    • Temperature scaling [16]


