Max class probability too low with a multi-class classifier
Hi @cdpierse , very nice project, congrats!
I am doing some experiments on bias in sentiment classifiers using our tool Rubrix together with transformers-interpret, and I have encountered an issue.
I am using the following sentiment pipeline:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import SequenceClassificationExplainer
model_name = "cardiffnlp/twitter-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
cls_explainer = SequenceClassificationExplainer(model, tokenizer)
And one of the weird examples is:
word_attributions = cls_explainer("This woman is a secretary.")
The model has three labels, and the explainer predicts LABEL_0 (negative), but the reported probability is suspiciously low (0.14), given that the model is multi-class (softmax over the three labels) rather than multi-label. Using the model widget on the Hugging Face Hub (https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment?text=This+woman+is+a+secretary.) I get a probability of 0.57. Maybe I'm missing something when creating the SequenceClassificationExplainer or loading the model.
Lastly, would it be possible to get all predicted labels and their probabilities, not only the max-probability label?
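For reference, this is roughly how I get all three label probabilities outside the explainer, with a plain forward pass plus softmax (a minimal sketch reusing the model and tokenizer above; LABEL_0 is negative for this model):
import torch

text = "This woman is a secretary."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1).squeeze()

# Per-label probabilities, comparable to what the Hub widget reports
for label, p in zip(["LABEL_0", "LABEL_1", "LABEL_2"], probs.tolist()):
    print(label, round(p, 3))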
Keep up the good work, and let me know if I can contribute a fix/enhancement if needed.
Hi @dvsrepo,
Thanks for pointing this out; this is an issue I've known about for a while but haven't had the chance to look into. I'm 90% sure it's a result of this line:
https://github.com/cdpierse/transformers-interpret/blob/615e82f7463c13537f3ae48ea4983c5380952fae/transformers_interpret/explainers/sequence_classification.py#L200
What's going on here is that I set the value self.pred_probs, which, as the name suggests, is the predicted label's probability; this attribute is what gets displayed by the visualize() method. The problem with the way I've done this, however, is that the value is set inside _forward(), which is called many times throughout the attribution calculation while the baseline is being integrated towards the actual input. Sometimes it just so happens that in the last step the integration is near perfect and the probability matches, but other times, like in your case, it's off.
What I need to do to fix this is either ensure this value is set only once or set it somewhere else entirely.
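Something along these lines, for example (just a sketch of the idea, not the actual implementation; the guard flag and the way the target class index is looked up are illustrative):
import torch

def _forward(self, input_ids, attention_mask=None):
    logits = self.model(input_ids, attention_mask=attention_mask)[0]
    probs = torch.softmax(logits, dim=-1)

    # Record the displayed probability only on the very first call; all later
    # calls come from Captum stepping along the baseline -> input path.
    if not getattr(self, "_pred_prob_recorded", False):
        self.pred_probs = probs[0][self.selected_index]
        self._pred_prob_recorded = True

    return probs[:, self.selected_index]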
Thanks for the quick response Charles!
Yes, that seems to be the root of the issue. What we do in biome-text, where we also use Captum, is a standard forward pass first, keeping the predictions, and then the integration (roughly the pattern sketched below), but I understand this might not fit your design and purpose. In my experiment, I log the 800 examples of my sentiment dataset into Rubrix, and this issue (too low a probability) happens for most of the examples predicted as negative (LABEL_0).
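A rough sketch of that "predict first, attribute second" pattern (not biome-text's actual code; the function name is just illustrative):
import torch

def predict_then_explain(model, tokenizer, explainer, text):
    # 1. Plain forward pass: these are the probabilities we keep and report.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1).squeeze()

    # 2. Attribution pass: run the explainer separately, so its intermediate
    #    forward calls cannot overwrite the probabilities reported to the user.
    word_attributions = explainer(text)
    return probs, word_attributions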
Let me know if I can help in any way
@dvsrepo quick update on this. I've been working away on this and have implemented logic to stop self.pred_probs from being set after the initial call. However, it appears this wasn't the root of the issue, and I now suspect that this is a RoBERTa-architecture-specific problem, primarily that I didn't pass token_type_ids as an input for the attribution calculation. I think I will need to do some further work to get it to play nicely with RoBERTa models, but I should be able to do it once I have figured out the slight differences in how RoBERTa-based models store their embeddings in HF.
Thanks so much, Charles, for looking into it. I guess it might also be the reason we're not getting very meaningful token attributions for that model.
I had a similar problem, but with MultiLabelClassificationExplainer, which extends SequenceClassificationExplainer. The predicted probabilities computed in https://github.com/cdpierse/transformers-interpret/blob/master/transformers_interpret/explainers/multilabel_classification.py#L133 were not correct, while the Hugging Face pipeline returned the right ones.
The issue seems to be caused by RoBERTa models, due to a mismatch in the position_ids passed to the forward method. In my case, I fine-tuned RoBERTa without passing position_ids, letting the model compute them automatically (https://github.com/huggingface/transformers/blob/main/src/transformers/models/roberta/modeling_roberta.py#L102).
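To illustrate the mismatch (a sketch of what that linked helper does, not a copy of it): RoBERTa builds position ids from the input ids and the padding index, so real tokens get ids starting at padding_idx + 1, whereas explicitly passing a plain 0..seq_len-1 range shifts everything relative to what the model was trained on:
import torch

def roberta_style_position_ids(input_ids: torch.Tensor, padding_idx: int = 1) -> torch.Tensor:
    # Non-padding tokens are numbered 1..n and then shifted by padding_idx,
    # so the first real token gets position id padding_idx + 1.
    mask = input_ids.ne(padding_idx).int()
    return (torch.cumsum(mask, dim=1) * mask).long() + padding_idx

input_ids = torch.tensor([[0, 713, 693, 16, 10, 2]])   # hypothetical token ids
print(roberta_style_position_ids(input_ids))           # tensor([[2, 3, 4, 5, 6, 7]])
print(torch.arange(input_ids.size(1)).unsqueeze(0))    # tensor([[0, 1, 2, 3, 4, 5]])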
I solved the problem by checking in SequenceClassificationExplainer whether the model is a RoBERTa one, and setting self.accepts_position_ids = False if so:
class SequenceClassificationExplainer(BaseExplainer):
    def __init__(
        self,
        model: PreTrainedModel,
        tokenizer: PreTrainedTokenizer,
        attribution_type: str = "lig",
        custom_labels: Optional[List[str]] = None,
    ):
        super().__init__(model, tokenizer)
        if model._get_name().startswith("Roberta"):
            self.accepts_position_ids = False
        ...
It is quite dirty, and there are likely better places to check for this kind of variation among models, but it works in my case :)
I'm having the same problem, using the Hugging Face TextClassificationPipeline(model=self.model, tokenizer=self.tokenizer, return_all_scores=True) in a multi-label text classification setting (xlm-roberta model):
model = bert_model.model # AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = bert_model.tokenizer # AutoTokenizer.from_pretrained(model_name)
cls_explainer = MultiLabelClassificationExplainer(model, tokenizer)
The "Prediction Score" is very different from the scores returned by the classification pipeline.
I tried to apply the 'patch' proposed by @jogonba2:
if "roberta" in model._get_name().lower():
self.accepts_position_ids = False
But the difference is still significant.
Are there any suggestions to fix this?
Thanks
I'm using a RoBERTa-based (CamemBERT) fine-tuned model and I get the same problem as described above: the predicted label probability in the attribution visualization doesn't match the softmaxed logits from the model output. I applied the solution proposed by @jogonba2, which in my case is:
cls_explainer = SequenceClassificationExplainer(model, tokenizer)
cls_explainer.accepts_position_ids = False
This works for SequenceClassificationExplainer but not for MultiLabelClassificationExplainer.
So I guess that implementing the trick in both respective constructors could be a solution.
Thanks
Hi @VDuchauffour (and all who have run into this problem), I am currently working on a new release that has a fix for this. It effectively implements @jogonba2's excellent solution by inspecting the model's architecture type in the constructor and setting both accepts_token_type_ids and accepts_position_ids to False if it is a RoBERTa-type model. In the limited testing I've done, I've finally been able to get the model predictions in line for both the SequenceClassificationExplainer and the MultiLabelClassificationExplainer. I'm hoping to release it in the next week or so. I'll mention this post in the release notes.
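Roughly along these lines (a sketch of the constructor-level check, not the released code; the class name and the model_type list are illustrative):
from transformers import PreTrainedModel, PreTrainedTokenizer

ROBERTA_LIKE = ("roberta", "xlm-roberta", "camembert")  # illustrative list

class BaseExplainerSketch:
    def __init__(self, model: PreTrainedModel, tokenizer: PreTrainedTokenizer):
        self.model = model
        self.tokenizer = tokenizer
        self.accepts_position_ids = True
        self.accepts_token_type_ids = True
        # RoBERTa-style models compute their own position ids and take no
        # token_type_ids, so don't pass either into the attribution forward calls.
        if getattr(model.config, "model_type", "") in ROBERTA_LIKE:
            self.accepts_position_ids = False
            self.accepts_token_type_ids = False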
This is a difficult issue to debug, but it does seem to be related to RoBERTa's unconventional handling of position_ids, discussed a bit here https://github.com/huggingface/transformers/issues/10736 and here https://github.com/huggingface/transformers/issues/5285. I would need to do a bit more digging into the original fairseq implementation to get to the bottom of it, but I suspect this fix will do for most users.
Hi everyone, this issue should be partially (maybe totally) addressed by the latest release (#99). I've done some tests myself, and the output probabilities now seem to be correct, or within an acceptable threshold of the pipeline's. RoBERTa makes some interesting architecture changes over vanilla BERT when it comes to ids and input representations. I'd like to close this issue if possible, but would like to hear some feedback on this update first. So if any of you are able to test the issues you had previously, let me know; otherwise I will likely close this issue within the next week.