Gabriele Sarti

46 comments by Gabriele Sarti

Hi @saxenarohit, in principle the Captum LRP implementation should be directly compatible with Inseq. However, the implementation is very model-specific, with some notable (and to my knowledge, presently unsolved)...
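For context, a minimal sketch of how Captum's LRP is typically instantiated on a toy model (illustrative only, not Inseq code); transformer architectures would additionally need custom propagation rules for modules Captum does not support out of the box, which is part of what makes the implementation so model-specific:

```python
import torch
import torch.nn as nn
from captum.attr import LRP

# Toy classifier built only from modules Captum's LRP supports out of the box
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
model.eval()

inputs = torch.randn(1, 8)
predicted_class = int(model(inputs).argmax(dim=-1))

# Relevance of each input feature for the predicted class; a transformer would
# need per-module propagation rules (e.g. for LayerNorm) before this call works
attributions = LRP(model).attribute(inputs, target=predicted_class)
```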

Hey @LuukSuurmeijer, thanks a lot for this PR! I had a look and added some very minor fixes (added a `Literal` type for the allowed precision strings, added a docstring...
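As an illustration of the first fix, something along these lines (the exact precision strings below are hypothetical, not necessarily the ones used in the PR):

```python
from typing import Literal, Optional

# Hypothetical values: the allowed precision strings in the actual PR may differ
PrecisionType = Literal["full", "half", "8bit", "4bit"]

def load_model(model_name: str, precision: Optional[PrecisionType] = "full"):
    """Load the model, casting its weights to the requested precision."""
    ...
```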

Thanks for the quick answer! Sounds good for `attention_mask`. Regarding the return dictionary, in principle, having the sequences would already mean enabling most gradient/occlusion-based methods. Attention attribution is actively being...
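Assuming the return dictionary in question is the one produced by 🤗 Transformers' `generate`, a sketch of the fields involved (field names follow that API):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

inputs = tokenizer(["A simple example"], return_tensors="pt")
out = model.generate(
    **inputs,
    return_dict_in_generate=True,  # dict-like output instead of a plain tensor of ids
    output_attentions=True,        # also expose attention weights for attention attribution
)
# out.sequences is what gradient/occlusion-based methods need as attribution targets
generated_ids = out.sequences
```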

Here is a functioning version @mrektor @michelecafagna26

```python
from captum.attr import InputXGradient
from transformers import pipeline

pipe = pipeline('text2text-generation', model='google/flan-t5-base', tokenizer='google/flan-t5-base', device='cuda')
input_ids = pipe.tokenizer(["A simple example"], return_tensors="pt", padding=True, truncation=True).input_ids.to('cuda')
...
```
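A possible continuation of the snippet (hypothetical, not part of the original comment), attributing the first generated token to the input embeddings:

```python
import torch

model = pipe.model
# Leaf copy of the input embeddings so gradients can be taken w.r.t. them
embeddings = model.get_input_embeddings()(input_ids).detach().requires_grad_(True)
decoder_input_ids = torch.full(
    (input_ids.shape[0], 1), model.config.decoder_start_token_id, device=input_ids.device
)

def forward_func(inputs_embeds):
    out = model(inputs_embeds=inputs_embeds, decoder_input_ids=decoder_input_ids)
    return out.logits[:, -1, :]  # logits for the first generated token

target_id = int(forward_func(embeddings).argmax(dim=-1))
attributions = InputXGradient(forward_func).attribute(embeddings, target=target_id)
```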

Hi @kayoyin @CoderPat, could you please let me know if you intend to fix the issue with the highlights? Thank you in advance!

@lsickert The attention-based feature attribution methods you mention involve the simplest case of taking the average attention weight for every token across all model layers, or the attention weight for...
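For illustration, the simplest averaging case could look roughly like this with 🤗 Transformers (a sketch, not Inseq's implementation):

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

inputs = tokenizer(["A simple example"], return_tensors="pt")
out = model.generate(**inputs, return_dict_in_generate=True, output_attentions=True)

# out.cross_attentions: one tuple per generated token, each holding one tensor
# per layer of shape (batch, heads, 1, source_len)
step_scores = []
for step in out.cross_attentions:
    layers = torch.stack(step)                               # (layers, batch, heads, 1, source_len)
    step_scores.append(layers.mean(dim=(0, 2)).squeeze(1))   # average over layers and heads
attention_scores = torch.stack(step_scores, dim=1)           # (batch, target_len, source_len)
```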

Good point, I'd say returning the attention scores by default shouldn't be a problem, and it's probably the easiest way to ensure compatibility with other methods without dramatic changes to...

Wherever possible we want to make methods customizable, but with sensible defaults for those not interested in fiddling with them. For attention, the ideal setting would probably be to use...
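Sketch of what such a customizable-with-defaults interface could look like (parameter names and defaults are illustrative, not Inseq's actual API):

```python
from typing import Optional, Sequence
import torch

def aggregate_attention(
    attentions: Sequence[torch.Tensor],      # one (batch, heads, tgt_len, src_len) tensor per layer
    layers: Optional[Sequence[int]] = None,  # default: all layers
    heads: Optional[Sequence[int]] = None,   # default: all heads
    aggregate_fn: str = "mean",              # default aggregation over the selected layers and heads
) -> torch.Tensor:
    stacked = torch.stack(list(attentions))  # (layers, batch, heads, tgt_len, src_len)
    if layers is not None:
        stacked = stacked[list(layers)]
    if heads is not None:
        stacked = stacked[:, :, list(heads)]
    if aggregate_fn == "max":
        return stacked.amax(dim=(0, 2))
    return stacked.mean(dim=(0, 2))
```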

The latter would probably make more sense, deeming relevant what was relevant for at least one head. I don't have an intuition for what to expect from the results though,...
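A toy illustration of the difference between the two head-aggregation choices (values are made up): the max keeps a token that only a single head attended to, while the mean dilutes it.

```python
import torch

head_scores = torch.tensor([[0.9, 0.05, 0.05],   # head 1 attends mostly to token 0
                            [0.1, 0.10, 0.80],   # head 2 attends mostly to token 2
                            [0.1, 0.10, 0.80]])  # head 3 attends mostly to token 2

print(head_scores.mean(dim=0))  # tensor([0.3667, 0.0833, 0.5500])
print(head_scores.amax(dim=0))  # tensor([0.9000, 0.1000, 0.8000])
```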

This is an important point. In the current gradient attribution approaches, if the user does not provide a target output we first generate the output sentence using whatever strategy is...
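Roughly, the flow described here (a sketch with hypothetical helper names, not Inseq's actual internals):

```python
from typing import Optional

def attribute(model, tokenizer, source: str, target: Optional[str] = None):
    if target is None:
        # No target provided: generate one first with the model's default decoding strategy
        input_ids = tokenizer(source, return_tensors="pt").input_ids
        generated_ids = model.generate(input_ids)
        target = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
    # Then run the chosen gradient attribution method on the (source, target) pair
    return run_attribution_method(model, tokenizer, source, target)  # hypothetical helper
```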