K-BERT

attention score problem

Open zhaiyutong opened this issue 3 years ago • 1 comment

Hi, first of all, thank you for your awesome work and code!

While porting your PyTorch code to TensorFlow, I ran into a question.

In your code, you build the attention mask from the visible matrix in bert_encoder.py, and then in multi_headed_attn.py you have the following code at line 59:

scores = scores + mask

I am wondering whether this corresponds to the attention score function, Eq. (5), in your paper. Is the mask the additional M term?

Thank you in advance for your response.

zhaiyutong, Dec 29 '20 06:12

Yes, the mask is represented as the matrix M in our paper.

autoliuweijie, Dec 29 '20 13:12
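
For anyone porting the same step, here is a minimal, framework-agnostic sketch of how an additive mask such as M acts on attention scores before the softmax. It is not the repository's code. It assumes the visible matrix holds 1 for visible token pairs and 0 for invisible ones, and that the mask is 0 for visible pairs and a large negative number otherwise. The constant `NEG`, the helper names, and the toy values are all hypothetical and chosen only for illustration.

```python
import math

# Assumption: a large negative constant stands in for -infinity.
# The exact value used in any given codebase may differ.
NEG = -10000.0


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def additive_mask(visible):
    """Map a 0/1 visibility matrix to an additive mask:
    0 where a pair is visible, NEG where it is not."""
    return [[(1.0 - v) * NEG for v in row] for row in visible]


def masked_attention_probs(scores, mask):
    """Add the mask to the scores element-wise (scores = scores + mask),
    then apply softmax to each row."""
    return [
        softmax([s + m for s, m in zip(score_row, mask_row)])
        for score_row, mask_row in zip(scores, mask)
    ]


# Toy example: 3 tokens; token 0 cannot see token 2 and vice versa.
visible = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
scores = [
    [0.5, 1.0, 2.0],
    [0.1, 0.2, 0.3],
    [1.0, 0.0, -1.0],
]

probs = masked_attention_probs(scores, additive_mask(visible))
for row in probs:
    print([round(p, 4) for p in row])
```

Masked positions end up with essentially zero attention weight, while each row still sums to 1 over the visible positions. In a TensorFlow port, the same effect should follow from adding a tensor built this way to the score tensor before the softmax, provided the mask broadcasts across the batch and head dimensions.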