attention-learn-to-route

Masking in SHA

Open • shagharabbani opened this issue 2 years ago • 1 comment

Hi,

Would it be possible to apply masking only in the decoder's single-head attention (SHA)? As far as I can tell, masking is currently applied in both the multi-head attention (MHA) and the SHA in the decoder.

Best, Shaghayegh

shagharabbani, Feb 23 '23 18:02

Hi @shagharabbani, I think this would definitely be possible, but it is currently not implemented. I'm not completely sure why you'd want that, but feel free to try it!

wouterkool, May 30 '23 13:05
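
For anyone who wants to try this, here is a rough sketch of the idea in plain Python. It is not taken from the repository: `decoder_step`, `mask_glimpse`, and the toy inputs are hypothetical names, and the MHA glimpse is reduced to a single head to keep it short. The visited-node mask is always applied to the final SHA logits, while masking in the glimpse step is optional. It assumes at least one node is still unvisited and that keys and values have the same dimension.

```python
import math


def softmax(scores):
    """Softmax that gives -inf entries exactly zero probability."""
    finite = [s for s in scores if s != -math.inf]
    top = max(finite)  # assumes at least one unmasked entry
    exps = [0.0 if s == -math.inf else math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def decoder_step(query, keys, values, visited, mask_glimpse=True):
    """One decoding step: glimpse attention, then single-head logits.

    mask_glimpse is a hypothetical switch: when False, visited nodes
    still contribute to the glimpse but can never be selected.
    """
    scale = math.sqrt(len(query))

    # Glimpse step (single head here for brevity; stands in for the MHA).
    scores = [dot(query, k) / scale for k in keys]
    if mask_glimpse:
        scores = [-math.inf if v else s for s, v in zip(scores, visited)]
    weights = softmax(scores)
    dim = len(values[0])
    glimpse = [sum(w * val[i] for w, val in zip(weights, values))
               for i in range(dim)]

    # SHA step: the mask is always applied, so visited nodes get probability 0.
    logits = [dot(glimpse, k) / scale for k in keys]
    logits = [-math.inf if v else l for l, v in zip(logits, visited)]
    return weights, softmax(logits)


# Toy example: three nodes, the first one already visited.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = keys
query = [0.5, 0.5]
visited = [True, False, False]

w_masked, p_masked = decoder_step(query, keys, values, visited, mask_glimpse=True)
w_open, p_open = decoder_step(query, keys, values, visited, mask_glimpse=False)
```

With `mask_glimpse=False` the visited node still receives non-zero attention weight in the glimpse, but its selection probability stays zero because the SHA mask is untouched.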