prompt-to-prompt
Question about the effect of softmax on attention map swapping
Hi, I found that attention map swapping is performed after the softmax operation. In that case, the attention weights for a given query may no longer sum to 1 once some of them have been replaced. Have the authors tried swapping the attention maps before the softmax operation instead?
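To make the concern concrete, here is a minimal NumPy sketch, not the repository's code. It assumes only a subset of token columns is replaced; the shapes, the random logits, and the `shared` index list are all made up for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_pixels, n_tokens = 4, 5                             # hypothetical sizes
logits_src = rng.normal(size=(n_pixels, n_tokens))    # source prompt scores
logits_tgt = rng.normal(size=(n_pixels, n_tokens))    # edited prompt scores
shared = [0, 1, 2]                                    # hypothetical token columns to inject

# Swap AFTER softmax: copy normalized source columns into the target map.
attn_src = softmax(logits_src)
attn_tgt = softmax(logits_tgt)
post = attn_tgt.copy()
post[:, shared] = attn_src[:, shared]
post_row_sums = post.sum(axis=-1)

# Swap BEFORE softmax: mix the raw scores, then normalize once.
mixed_logits = logits_tgt.copy()
mixed_logits[:, shared] = logits_src[:, shared]
pre = softmax(mixed_logits)
pre_row_sums = pre.sum(axis=-1)

print("row sums, swap after softmax: ", post_row_sums)
print("row sums, swap before softmax:", pre_row_sums)
```

In this sketch the rows of `post` generally do not sum to 1, while the rows of `pre` do by construction. If every column were replaced rather than a subset, the post-softmax swap would keep the rows normalized as well.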
I guess that before softmax, the absolute value does not have a specific meaning outside of its context.
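One way to read this point is through the shift invariance of softmax: adding a constant to every score in a row leaves the attention weights unchanged, so a raw score is only meaningful relative to the other scores in the same row. The following sketch uses made-up numbers and is an illustration of that property, not a statement about what the authors tested.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Two sets of source scores that differ only by a constant offset.
src_a = np.array([1.0, 0.0, -1.0])
src_b = src_a + 10.0
same_attention = np.allclose(softmax(src_a), softmax(src_b))

# Target scores for the same query (hypothetical values).
tgt = np.array([0.0, 2.0, 1.0])

# Inject the score of token 0 from each source version before softmax.
mixed_a = tgt.copy()
mixed_a[0] = src_a[0]
mixed_b = tgt.copy()
mixed_b[0] = src_b[0]

attn_a = softmax(mixed_a)
attn_b = softmax(mixed_b)

print("sources give identical attention:", same_attention)
print("mixed with src_a:", attn_a)
print("mixed with src_b:", attn_b)
```

Although `src_a` and `src_b` describe the same source attention, injecting their raw scores into the target row gives different results, which is the risk with mixing pre-softmax values taken from different contexts.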