pytorch-transformer

Clarification regarding dropout in the multihead attention block

Open anupsingh15 opened this issue 1 year ago • 0 comments

Hi @hkproj

Why do you apply dropout to the attention scores (line 110 in model.py)? Shouldn't the dropout in the multi-head attention block be removed, since dropout is already applied (line 81) in the residual connection block?
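To make the question concrete, here is a minimal sketch of the two dropout sites as I understand them. It is not copied from model.py: the names `attention` and `Residual`, the pre-norm layout, the single-head shapes, and the dropout rate of 0.1 are all assumptions made for illustration.

```python
import math

import torch
import torch.nn as nn


def attention(q, k, v, dropout=None):
    """Scaled dot-product attention; q, k, v have shape (batch, heads, seq_len, d_k)."""
    d_k = q.shape[-1]
    scores = (q @ k.transpose(-2, -1)) / math.sqrt(d_k)
    weights = scores.softmax(dim=-1)
    if dropout is not None:
        # Dropout site 1 (assumed to correspond to line 110): acts on the
        # (seq_len x seq_len) attention weights.
        weights = dropout(weights)
    return weights @ v


class Residual(nn.Module):
    """Hypothetical pre-norm residual wrapper around an arbitrary sublayer."""

    def __init__(self, d_model, p):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(p)

    def forward(self, x, sublayer):
        # Dropout site 2 (assumed to correspond to line 81): acts on the
        # sublayer output of shape (batch, seq_len, d_model).
        return x + self.dropout(sublayer(self.norm(x)))


torch.manual_seed(0)
x = torch.randn(2, 5, 8)          # (batch, seq_len, d_model)
attn_dropout = nn.Dropout(0.1)    # site 1
block = Residual(d_model=8, p=0.1)  # owns site 2


def sublayer(h):
    # Single-head self-attention: add and remove a heads dimension of size 1.
    h4 = h.unsqueeze(1)
    return attention(h4, h4, h4, attn_dropout).squeeze(1)


out = block(x, sublayer)
print(out.shape)
```

In this sketch the first dropout zeroes individual attention weights, while the second zeroes features of the sublayer output, so they act on different tensors. My question is whether the first one is still needed given the second.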

anupsingh15 Apr 07 '24 19:04