Integrated gradients on raw text PoC
This is a WIP PR addressing #289.
To apply integrated gradients to raw text, I've written a new class `IntegratedGradientsText` which takes an additional object of type `TextTransformer`. The `TextTransformer` object must define two methods:
- `texts_to_tokens` - given a list of strings, return a list of lists of string tokens (i.e. a ragged array). This is necessary to output attributions for each token.
- `texts_to_array` - given a list of strings, return a homogeneous numpy array ready for model consumption. This is necessary to prepare the raw text for the model.
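For illustration, a minimal transformer satisfying this interface might look like the following. This is only a sketch: `WhitespaceTransformer` and its toy vocabulary are hypothetical names of mine, not part of the PR, and in practice the class would subclass `TextTransformer`:

```python
import numpy as np

class WhitespaceTransformer:
    """Toy transformer: whitespace tokens, fixed-size integer arrays.
    (In the PR this would subclass TextTransformer.)"""

    def __init__(self, vocab, maxlen):
        self.vocab = vocab      # dict: token -> integer id (assumed)
        self.maxlen = maxlen    # fixed sequence length for the model

    def texts_to_tokens(self, texts):
        # Ragged list of token lists, one per input string
        return [text.split() for text in texts]

    def texts_to_array(self, texts):
        # Homogeneous (n, maxlen) array, zero-padded; 0 = pad/unknown
        array = np.zeros((len(texts), self.maxlen), dtype=np.int64)
        for i, tokens in enumerate(self.texts_to_tokens(texts)):
            for j, tok in enumerate(tokens[: self.maxlen]):
                array[i, j] = self.vocab.get(tok, 0)
        return array
```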
Thus, for a custom use case the user has to inherit from `TextTransformer` and implement these two methods. I have also provided an implementation `KerasTextTransformer` which takes a Keras `Tokenizer` and a parameter `maxlen` (to define the array dimension) - see the example.
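A sketch of how a `Tokenizer` plus `maxlen` could be combined into this interface. `KerasLikeTokenizer` is a stand-in that duck-types the real Keras `Tokenizer.texts_to_sequences` method so the snippet is self-contained; everything beyond the `Tokenizer`/`maxlen` inputs is an assumption, not the PR's actual implementation:

```python
import numpy as np

class KerasLikeTokenizer:
    """Stand-in exposing texts_to_sequences like keras'
    preprocessing Tokenizer, so the example runs without TF."""
    def __init__(self, word_index):
        self.word_index = word_index  # token -> integer id

    def texts_to_sequences(self, texts):
        return [[self.word_index[w] for w in t.split() if w in self.word_index]
                for t in texts]

class KerasTextTransformerSketch:
    """Hypothetical sketch: wrap a (Keras-style) tokenizer and maxlen
    into the two-method TextTransformer interface."""
    def __init__(self, tokenizer, maxlen):
        self.tokenizer = tokenizer
        self.maxlen = maxlen

    def texts_to_tokens(self, texts):
        # Assumption: simple whitespace split for token display
        return [t.split() for t in texts]

    def texts_to_array(self, texts):
        # Integer sequences, zero-padded/truncated to (n, maxlen)
        seqs = self.tokenizer.texts_to_sequences(texts)
        out = np.zeros((len(texts), self.maxlen), dtype=np.int64)
        for i, s in enumerate(seqs):
            out[i, : min(len(s), self.maxlen)] = s[: self.maxlen]
        return out
```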
Main things to discuss:
- In the implementation I assume that the attributions are calculated with respect to the embedding layer, hence I've hardcoded a sum over the embedding axis to produce attributions at the token level. This may be brittle and fail if other layers are to be explained. At a minimum we should provide thorough documentation of how to use IG for text (or should we even support attributions only for the embedding layer?)
- The baselines are with respect to the layer being explained, so in the example, if custom baselines are used, they should have the same shape as the layer explained.
- It is not clear how to name the returned attributions. Currently `attributions` refers to the raw attributions for the layer explained and has the same shape. I have named the summed attributions `attrs`, which is a list of arrays of floats (i.e. a ragged array, one per instance) of the same size as `tokens`, which is also returned (a list of lists of string tokens, i.e. a ragged array). We could also apply the same reasoning to `baselines` and return a summed version as a ragged array, or even allow a ragged array to be taken as an input baseline to `explain`.
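To make the shapes concrete, here is a toy sketch of the two steps discussed above: the hardcoded sum over the embedding axis, and trimming the padded result into ragged `attrs` aligned with `tokens`. The shapes and token lists are invented for illustration:

```python
import numpy as np

# Toy raw attributions at the embedding layer:
# shape (batch, maxlen, embedding_dim)
rng = np.random.default_rng(0)
attributions = rng.normal(size=(2, 4, 8))

# Hardcoded sum over the embedding axis -> one scalar per token position
summed = attributions.sum(axis=-1)              # shape (2, 4)

# Each instance has its own token count, so trim the padded rows into a
# ragged list aligned with the tokens from texts_to_tokens
tokens = [["good", "movie", "!"], ["terrible"]]  # ragged, toy example
attrs = [row[: len(toks)] for row, toks in zip(summed, tokens)]
```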