hierarchical-attention-networks Mask for attention weight

Open 2g-XzenG opened this issue 6 years ago • 0 comments

Hi ematvey,

Thanks for sharing the code!

I noticed that the attention weights for sentences and words are not masked according to their actual lengths, which means the model will "pay attention" to the useless (padded) input. Is there a reason you didn't use a mask in this project?
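To make the question concrete, here is a minimal sketch of the kind of masking I mean. It is written in plain NumPy and is not taken from this repository. The function name `masked_softmax` and the argument names `scores` and `lengths` are my own placeholders, and it assumes a padded `(batch, max_len)` score matrix in which every sequence has a length of at least 1.

```python
import numpy as np

def masked_softmax(scores, lengths):
    """Softmax over the last axis that ignores padded positions.

    scores  -- hypothetical (batch, max_len) array of unnormalized attention scores
    lengths -- hypothetical (batch,) array of true sequence lengths (each >= 1)
    """
    scores = np.asarray(scores, dtype=np.float64)
    lengths = np.asarray(lengths)
    max_len = scores.shape[1]

    # True for real tokens, False for padding: position index < sequence length.
    mask = np.arange(max_len)[None, :] < lengths[:, None]

    # Padded positions get -inf, so they contribute exp(-inf) = 0 to the softmax.
    masked = np.where(mask, scores, -np.inf)

    # Subtract the row-wise maximum for numerical stability before exponentiating.
    masked = masked - masked.max(axis=1, keepdims=True)
    exp = np.exp(masked)
    return exp / exp.sum(axis=1, keepdims=True)

# Second row has true length 2, so its third position should get zero weight.
weights = masked_softmax([[1.0, 2.0, 3.0], [1.0, 1.0, 5.0]], [3, 2])
```

Without the mask, the large score at the padded position in the second row would take almost all of the attention weight. With it, the weight is split only over the two real positions.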

Please correct me if I am wrong. Thanks! Xianlonb

Jun 14 '18 18:06 2g-XzenG
