Attention-Augmented-Conv2d
Memory/Time Complexity of the relative positional encoding
Thanks for your project.
I have some questions about the implementation of the relative positional encoding. According to your implementation, the memory cost is O(H^2 W^2), while the paper states that it is optimized down to O(HW).
Besides, I have also tried your method on a semantic segmentation task and found it to be very slow and to consume a huge amount of memory.
I am wondering whether you have addressed these memory and time issues.
Thanks for your comment!
- memory cost
- I thought the conventional relative position encoding costs O(H^2 W^2) because it materializes a full pairwise matrix over all H×W positions. The current code, however, is O(HW) because it only stores 1D embedding vectors of length H and W.
- Time issues
- I'll fix it as soon as possible. Thank you!
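For reference, the O(HW)-parameter scheme discussed above can be sketched as follows. This is a minimal, illustrative PyTorch version of the "rel_to_abs" padding/reshaping trick from the Attention Augmented Convolutional Networks paper, shown here along one axis only; the function and variable names are my own and may not match this repository's code.

```python
import torch

def rel_to_abs(x):
    """Convert relative-position logits (B, Nh, L, 2L-1) into
    absolute-position logits (B, Nh, L, L) without ever building a
    per-pair embedding table.
    """
    B, Nh, L, _ = x.shape
    # Append a zero column so each row has even length 2L.
    x = torch.cat([x, x.new_zeros(B, Nh, L, 1)], dim=3)
    # Flatten, then pad with L-1 zeros so the next reshape shifts
    # every row by one relative offset.
    flat = x.reshape(B, Nh, L * 2 * L)
    flat = torch.cat([flat, flat.new_zeros(B, Nh, L - 1)], dim=2)
    # Reshape to (L+1, 2L-1); slicing from column L-1 onward now
    # yields out[i, j] = x[i, j - i + L - 1].
    final = flat.reshape(B, Nh, L + 1, 2 * L - 1)
    return final[:, :, :L, L - 1:]

# Demo: logits are built from a 1D relative embedding with 2L-1 rows
# (O(L) parameters) rather than a full LxL pairwise table (O(L^2)).
L, dk = 4, 8
q = torch.randn(1, 1, L, dk)           # (B, Nh, L, dk) query slices
key_rel = torch.randn(2 * L - 1, dk)   # one embedding per relative offset
rel_logits = torch.einsum('bhld,md->bhlm', q, key_rel)  # (B, Nh, L, 2L-1)
abs_logits = rel_to_abs(rel_logits)    # (B, Nh, L, L)
```

Applying this once along width (length W) and once along height (length H) keeps the embedding parameters at O(H + W) and avoids the O(H^2 W^2) pairwise table, although the attention logits themselves are still (HW)×(HW), which is likely why segmentation-sized inputs remain slow and memory-heavy.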