
SHAP plot predicts the opposite label from the one the classifier predicted

thak123 opened this issue 2 years ago • 6 comments

I trained a custom model using HF for 2-class classification. I got the following prediction from the classifier:

[[{'label': 'LABEL_0', 'score': 0.9944180250167847},
  {'label': 'LABEL_1', 'score': 0.005582042038440704}]]

which says LABEL_0 is the predicted label. But when I visualize the sentence using the code below, the plot suggests that LABEL_1 is more likely. The base values from the explanation object are:

.base_values =
array([[0.00245374, 0.99754626]])

[screenshot of the SHAP text plot output]

import shap
import transformers
from transformers import AutoTokenizer

transformers.__version__  # 4.17.0
shap.__version__          # 0.40.0

# Model and explainer initialisation

# load a transformers pipeline model
model = transformers.pipeline(
    task="text-classification",
    tokenizer=AutoTokenizer.from_pretrained("xlm-roberta-base"),
    model='/content/drive/MyDrive/best_model_binary-original_2-class/',
    return_all_scores=True,
    device=0,
)

# explain the model on a sample input and plot the attributions for LABEL_0
explainer = shap.Explainer(model)
shap_values = explainer([text])  # `text` is the sentence being explained (not shown here)
shap.plots.text(shap_values[0, :, "LABEL_0"])
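
A quick way to confirm which output column corresponds to which label is to inspect the explanation object directly. A minimal sketch, assuming the shap_values object from above:

# order of the output columns in .values / .base_values
print(shap_values.output_names)  # e.g. ['LABEL_0', 'LABEL_1']

# per-label attribution totals for the first input; adding the base values
# should reconstruct the model's per-label output
print(shap_values.base_values[0] + shap_values.values[0].sum(axis=0))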


thak123 avatar Mar 09 '22 18:03 thak123

I have the same problem with the same configuration. Although the difference between the values is not as extreme, the f(label) shown in the plot differs from the prediction right out of the pipeline. Since the values are really different and not merely reversed, I think our problems are still linked somehow. Have you found anything about solving this?

nk-fouque avatar Jul 04 '22 13:07 nk-fouque

Nope. I parked the exploration because, for each case, it was either predicting the correct label or not. And yes, it's not simply reversed.

I haven't tried the latest version; maybe it's fixed there... I don't know. I will try it this weekend.

If you manage to fix the issue, kindly comment back.

thak123 avatar Jul 04 '22 13:07 thak123

@thak123 I have found that when using something other than the pure tokenizer as a masker, I get the correct values. Can you try this and tell me if it fixes your problem too, since yours was more extreme than mine?

masker = shap.maskers.Text(r"[\s]", collapse_mask_token=False)
explainer = shap.Explainer(model, masker)

This masker splits only on whitespace. With collapse_mask_token=False, a masked word that the actual tokenizer splits into several tokens becomes several mask tokens, so the length of the sequence doesn't change.
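
For reference, a minimal end-to-end sketch of this workaround (assuming the `model` pipeline from the original post; the sample sentence is a placeholder). SHAP's additivity property means the base values plus the summed attributions should reproduce the model's per-label output, which gives a quick check of the plot against the pipeline:

import shap

# whitespace-only masker: each space-delimited word is masked as one unit
masker = shap.maskers.Text(r"[\s]", collapse_mask_token=False)
explainer = shap.Explainer(model, masker)

# placeholder input; substitute the sentence you are explaining
shap_values = explainer(["This movie was surprisingly good."])

# sanity check: base value + sum of attributions should match the
# pipeline's per-label scores for this input
reconstructed = shap_values.base_values[0] + shap_values.values[0].sum(axis=0)
print(reconstructed)                                 # per-label model output
print(model(["This movie was surprisingly good."]))  # pipeline prediction

shap.plots.text(shap_values[0, :, "LABEL_0"])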

nk-fouque avatar Jul 06 '22 09:07 nk-fouque

@nk-fouque I will try it in the evening and let you know.

thak123 avatar Jul 06 '22 09:07 thak123

@nk-fouque Yes, you are correct. I am getting correct predictions.

thak123 avatar Jul 07 '22 17:07 thak123

Alright. As for why that happens and how to preserve this behavior with the regular tokenizer, I have no clue. Any insights from @slundberg?

nk-fouque avatar Jul 08 '22 08:07 nk-fouque

I had the same issue, and I tried what @nk-fouque mentioned in https://github.com/slundberg/shap/issues/2425#issuecomment-1176007354, but the results still differ from the original model's. I also noticed that for some sentences (like simple or short ones) this issue does not happen.

[three screenshots comparing pipeline predictions with SHAP outputs]

Do you have any suggestions?

hwq0726 avatar Jun 27 '23 16:06 hwq0726