Based on https://github.com/TeamHG-Memex/eli5/pull/315 and https://github.com/TeamHG-Memex/eli5/pull/329, we use Grad-CAM to highlight parts of text that contribute to a prediction of a Keras classifier.

For example, we can use a call similar to this:

eli5.show_prediction(model, doc, 
        tokens=doc_tokens,
        pad_x=word_index['<PAD>'],
        padding='post',
        relu=False,
)

To roughly highlight positive and negative parts of a text (explaining the score given in sentiment analysis): Screenshot from 2019-07-17 18-13-25

This PR also makes refactorings (including changes to the public API) to the code at the linked PR's.

WIP items:

~~Tutorial.~~
~~Pass CI and coverage.~~
~~Tests (manual and automated).~~
~~Docs.~~
~~Mypy.~~

Jul 17 '19 17:07 teabolt

Codecov Report

Merging #325 into master will decrease coverage by 0.09%. The diff coverage is 96.83%.

@@            Coverage Diff            @@
##           master     #325     +/-   ##
=========================================
- Coverage   97.31%   97.22%   -0.1%     
=========================================
  Files          49       52      +3     
  Lines        3134     3318    +184     
  Branches      584      624     +40     
=========================================
+ Hits         3050     3226    +176     
- Misses         44       48      +4     
- Partials       40       44      +4

Impacted Files	Coverage Δ
eli5/keras/__init__.py	`100% <ø> (ø)`	:arrow_up:
eli5/nn/__init__.py	`100% <100%> (ø)`
eli5/keras/explain_prediction.py	`98.37% <100%> (+1.98%)`	:arrow_up:
eli5/keras/gradcam.py	`100% <100%> (ø)`	:arrow_up:
eli5/nn/gradcam.py	`100% <100%> (ø)`
eli5/base.py	`100% <100%> (ø)`	:arrow_up:
eli5/nn/text.py	`92.45% <92.45%> (ø)`
eli5/formatters/image.py	`98.71% <93.33%> (-1.29%)`	:arrow_down:
eli5/sklearn/permutation_importance.py	`100% <0%> (ø)`	:arrow_up:
... and 4 more

Jul 17 '19 19:07 codecov-io

OK github (or me) indeed screwed up some inline comments, let me remove and re-add them.

Aug 14 '19 09:08 lopuhin

Re-posting removed inline comments:

regarding padding argument: looks like we can strip pad_value both from the start and the end, thus removing padding argument from the API.
regarding pad_value as number or string: I think it would be more clear to separate them in the API, e.g. have pad_value to be a number (token id), and pad_token to be a string (token used for padding).
regarding the model used for multi-class classification: if there is a notebook/script where we trained the model, would be great to link it here.

Aug 14 '19 09:08 lopuhin

Some comments regarding the text tutorial (GH again places them on the wrong lines):

for layer in binary_model.layers:
    print(layer.name, layer.output_shape)
    if isinstance(layer, desired):
        ...

I'd suggest printing only the layers that we explain, else it's hard to understand the output below without checking the code in more detail.

Aug 19 '19 10:08 lopuhin

Regarding the text tutorial

It looks like <PAD> had some effect on the explanation.

I think it would be great to clarify that we are removing padding prior to passing the input to the model, so most likely it's the model which gives different results with and without padding, right? EDIT: sorry looks I'm wrong here

Aug 19 '19 10:08 lopuhin

Just finished the text tutorial, looks great 👍 One more comment regarding the note at the end

In general, this is experimental work. Unlike for images, this is not based on any paper.

I think that's a great note, what to you think of moving it to the start of the tutorial?

Aug 19 '19 10:08 lopuhin

I think it would be great to clarify that we are removing padding prior to passing the input to the model, so most likely it's the model which gives different results with and without padding, right?

Here I think it's the model that gives different results with and without padding. But in general we remove padding (using the pad_token or pad_value argument) after passing the input to the model.

Aug 19 '19 16:08 teabolt

Great, thanks for clarification @teabolt 👍

Aug 19 '19 17:08 lopuhin

Oh my gosh I want this please merge this as soon as possible 🥰

Sep 08 '19 21:09 Hellisotherpeople

@lopuhin Do you think there is anything major left to do in this PR? (besides making the CI pass)

Sep 10 '19 22:09 teabolt

@teabolt I don't recall anything significant, and from a quick glance it looks like all review feedback is addressed so I think it's almost ready.

Sep 17 '19 20:09 lopuhin

Codecov Report

Merging #325 into master will increase coverage by 0.02%. The diff coverage is 98.20%.

@@            Coverage Diff             @@
##           master     #325      +/-   ##
==========================================
+ Coverage   97.32%   97.34%   +0.02%     
==========================================
  Files          49       52       +3     
  Lines        3142     3320     +178     
  Branches      585      623      +38     
==========================================
+ Hits         3058     3232     +174     
- Misses         44       46       +2     
- Partials       40       42       +2

Impacted Files	Coverage Δ
eli5/ipython.py	`100.00% <ø> (ø)`
eli5/keras/__init__.py	`100.00% <ø> (ø)`
eli5/formatters/image.py	`98.71% <93.33%> (-1.29%)`	:arrow_down:
eli5/nn/text.py	`96.00% <96.00%> (ø)`
eli5/base.py	`100.00% <100.00%> (ø)`
eli5/keras/explain_prediction.py	`98.37% <100.00%> (+1.98%)`	:arrow_up:
eli5/keras/gradcam.py	`100.00% <100.00%> (ø)`
eli5/nn/__init__.py	`100.00% <100.00%> (ø)`
eli5/nn/gradcam.py	`100.00% <100.00%> (ø)`
... and 4 more

Jun 06 '20 12:06 codecov-commenter

Is this ever going to be merged? I'm super interested in seeing this get added to ELI5 ASAP!!!

Dec 21 '20 02:12 Hellisotherpeople

Is this ever going to be merged? I'm super interested in seeing this get added to ELI5 ASAP!!!

Hey @Hellisotherpeople . Sorry for the late reply. If you are still interested you can try installing the PR branch directly with pip:

If you are using virtualenv or a similar environment manager, make a clone of your environment.
Install the PR branch with pip install -U git+https://github.com/teabolt/eli5.git@keras-gradcam-text

To test that it worked run:

import eli5.keras
eli5.keras.explain_prediction_keras_text()

The output should be TypeError: explain_prediction_keras_text() missing 2 required positional arguments: 'model' and 'doc'

Tested this with Python 3.7 and Ubuntu 20.04.

Jan 11 '21 15:01 teabolt

eli5
eli5 copied to clipboard

Explain text predictions of Keras classifiers

Codecov Report

Codecov Report

eli5 eli5 copied to clipboard

Explain text predictions of Keras classifiers

Codecov Report

Codecov Report

eli5
eli5 copied to clipboard