fix: wrong mask for distilbert::MultiHeadSelfAttention

Open silver-ymz opened this issue 7 months ago • 0 comments

It seems that the attention mask should be inverted before it is applied in `distilbert::MultiHeadSelfAttention`, as the reference implementation in transformers does:

https://github.com/huggingface/transformers/blob/main/src/transformers/models/distilbert/modeling_distilbert.py#L218
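To make the concern concrete, here is a minimal, dependency-free sketch in plain Rust. It does not use candle's API. It assumes the usual Hugging Face convention that the attention mask holds 1 for real tokens and 0 for padding, so the fill value has to land where the mask is 0, i.e. on the inverted mask. The helper names `masked_fill` and `invert_mask` are hypothetical and exist only for illustration.

```rust
/// Replace `scores[i]` with `fill` wherever `fill_here[i]` is true.
/// (Hypothetical helper, mirroring the idea of a masked fill.)
fn masked_fill(scores: &[f32], fill_here: &[bool], fill: f32) -> Vec<f32> {
    scores
        .iter()
        .zip(fill_here)
        .map(|(&s, &m)| if m { fill } else { s })
        .collect()
}

/// Assumed convention: 1 = token to attend to, 0 = padding.
/// The fill must hit the padding positions, so the mask is inverted first.
fn invert_mask(attention_mask: &[u8]) -> Vec<bool> {
    attention_mask.iter().map(|&m| m == 0).collect()
}

fn main() {
    let scores = [0.5_f32, 1.0, -0.3, 2.0];
    let attention_mask = [1_u8, 1, 0, 0]; // last two positions are padding

    // Using the mask as-is fills the real tokens and keeps the padding.
    let not_inverted: Vec<bool> = attention_mask.iter().map(|&m| m != 0).collect();
    let wrong = masked_fill(&scores, &not_inverted, f32::MIN);

    // Inverting first fills only the padding positions.
    let right = masked_fill(&scores, &invert_mask(&attention_mask), f32::MIN);

    println!("without inversion: {:?}", wrong);
    println!("with inversion:    {:?}", right);
}
```

Without the inversion, the scores of the real tokens are overwritten and the padding scores survive, which is the opposite of what the softmax that follows needs.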

silver-ymz, Apr 11 '25 12:04