PyTorch-Batch-Attention-Seq2seq
masking for attention coefficients
I cannot understand why there is no masking operation when computing the attention coefficients. Without a mask, the softmax assigns nonzero attention weights to the padded timesteps of the encoder outputs, so the decoder can attend to positions that carry no information.
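For reference, here is a minimal sketch of the kind of masking I would expect before the softmax. The names (`masked_attention`, `scores`, `src_lengths`) and tensor shapes are my own assumptions for illustration, not taken from this repo's code:

```python
import torch
import torch.nn.functional as F

def masked_attention(scores, src_lengths):
    # scores: (batch, src_len) raw attention energies for one decoder step
    # src_lengths: (batch,) true source lengths before padding (assumed available)
    batch_size, src_len = scores.size()
    # mask[i, j] is True where position j is padding, i.e. j >= src_lengths[i]
    mask = torch.arange(src_len, device=scores.device).unsqueeze(0) >= src_lengths.unsqueeze(1)
    # fill padded positions with -inf so softmax gives them exactly zero weight
    scores = scores.masked_fill(mask, float('-inf'))
    return F.softmax(scores, dim=1)
```

With this in place, a batch containing sequences of different lengths would never leak attention mass onto pad tokens, e.g. `masked_attention(scores, torch.tensor([5, 3]))` zeroes out positions 3 and 4 of the second example.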
Same question here.