AbSViT questions about top_down

questions about top_down_transform

Open wefwefWEF2 opened this issue 1 year ago • 1 comments

Hi, thanks a lot for your great work. I have some questions about top_down_transform.

Why do we multiply masked_x by top_down_transform here, given that the selected features have already been obtained?

top_down_transform = prompt[..., None] @ prompt[..., None].transpose(-1, -2)
x = x @ top_down_transform * 5

Jan 15 '24 16:01 wefwefWEF2

Hi, that's a good question. This step selects the relevant features along the channel dimension, whereas the previous selection operates on the spatial dimension. We find that selecting along both dimensions enhances the effect of top-down attention.

Mar 21 '24 17:03 bfshi
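
For readers following the thread, the channel-wise selection described in the reply can be sketched as below. This is only an illustration in NumPy, not the repository's code: the shapes (x as (B, N, C) tokens, prompt as a (B, C) vector), the function name channel_top_down, and the random inputs are all assumptions. It shows that multiplying by the outer product of the prompt with itself is the same as projecting each token onto the prompt direction (up to a factor of the prompt's squared norm), which keeps only the channel component aligned with the prompt.

```python
import numpy as np

def channel_top_down(x, prompt, scale=5.0):
    """Hypothetical helper. x: (B, N, C) tokens, prompt: (B, C) vector (assumed shapes)."""
    p = prompt[..., None]                                # (B, C, 1)
    # Outer product p p^T: a rank-1 (B, C, C) channel-mixing matrix.
    top_down_transform = p @ np.swapaxes(p, -1, -2)
    # `@` and `*` share precedence, so this is (x @ T) * scale.
    return x @ top_down_transform * scale

rng = np.random.default_rng(0)
B, N, C = 2, 4, 8                                        # illustrative sizes
x = rng.normal(size=(B, N, C))
prompt = rng.normal(size=(B, C))

out = channel_top_down(x, prompt)

# Equivalent view: per-token dot product with the prompt, times the prompt.
coeff = (x * prompt[:, None, :]).sum(-1, keepdims=True)  # (B, N, 1)
expected = coeff * prompt[:, None, :] * 5.0
```

In this sketch, a token whose channel vector is orthogonal to the prompt is mapped to zero, while the earlier spatial masking decides which tokens are kept at all, so the two selections act on different axes.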
