Attention-Augmented-Conv2d

Memory/Time Complexity of the relative positional encoding

Open PkuRainBow opened this issue 5 years ago • 1 comment

Thanks for your project.

I have some questions about the implementation of the relative positional encoding. In your implementation, the memory cost appears to be O(H^2W^2), while the paper says they reduce the memory cost to O(HW).

Besides, I have also tried your method on semantic segmentation tasks and found that it is very slow and consumes a huge amount of memory.

I am wondering whether you have addressed the memory and time issues.

PkuRainBow avatar Jul 10 '19 07:07 PkuRainBow

Thanks for your comment!

  1. Memory cost
  • I think the conventional relative position encoding is O(H^2W^2) because it generates an HxW matrix. The current code, however, is O(HW) because it generates 1D vectors for H and W.
  2. Time issues
  • I'll fix it as soon as possible. Thank you!
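
To make the two figures in this thread concrete, here is a rough element-count sketch. It is not taken from this repository: the function names and the per-head depth `dk` are hypothetical, and it assumes that the "conventional" scheme stores one `dk`-dimensional embedding per (query, key) pair while the 1D scheme stores one embedding per relative offset along each axis.

```python
def explicit_relative_embedding_elems(H, W, dk):
    """Elements needed to hold one dk-dim embedding for every (query, key) pair."""
    return (H * W) * (H * W) * dk


def per_axis_relative_table_elems(H, W, dk):
    """Elements in two 1D tables: one embedding per relative offset along H and W."""
    return (2 * H - 1) * dk + (2 * W - 1) * dk


def attention_logit_elems(H, W, heads=1):
    """Elements in the attention logits: every position attends to every position."""
    return heads * (H * W) * (H * W)


if __name__ == "__main__":
    dk = 32  # assumed per-head key depth, for illustration only
    for size in (16, 32, 64):
        print(
            size,
            explicit_relative_embedding_elems(size, size, dk),
            per_axis_relative_table_elems(size, size, dk),
            attention_logit_elems(size, size),
        )
```

Under these assumptions, the 1D tables stay small as the feature map grows, but the attention logits still hold (HW)^2 entries per head no matter how the embeddings are stored. That may be part of why large segmentation inputs remain slow and memory-hungry.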

leaderj1001 avatar Jul 11 '19 02:07 leaderj1001