danet-pytorch
torch.max(attention, dim=-1, keepdim=True)[0].expand_as(attention) - attention
This line may reverse the weights: when you compute MAX - attention, the positions that had the maximum attention weight become zero. I also did not find any relevant explanation in the paper. Why add this line? A small sketch of the effect is below.
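A minimal sketch (not the repository's exact CAM code), assuming `energy` stands in for the B x C x C channel-affinity matrix computed in DANet's channel attention module, showing how subtracting each entry from its row-wise max flips which positions receive the largest weight after softmax:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for the channel affinity matrix (batch, C, C).
energy = torch.randn(2, 4, 4)

# The questioned line: subtract each entry from its row-wise maximum,
# so the position holding the max becomes zero before the softmax.
energy_new = torch.max(energy, dim=-1, keepdim=True)[0].expand_as(energy) - energy

attn_original = F.softmax(energy, dim=-1)      # softmax of the raw energies
attn_reversed = F.softmax(energy_new, dim=-1)  # softmax of max - energy

print(energy.argmax(dim=-1))         # positions with the largest raw energy
print(attn_original.argmax(dim=-1))  # same positions dominate without the line
print(attn_reversed.argmin(dim=-1))  # with the line, those positions get the least weight
```

Because `max - x` is monotonically decreasing in `x`, all three printed index tensors coincide: the line inverts the ordering of the attention weights rather than just shifting them.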
Did you figure out the issue?
I just deleted this line and ran my own experiments; no anomaly was spotted.