axial-deeplab
re-implement Stand-Alone Self-Attention model
Hi, @csrhddlam As we discussed before, I am trying to re-implement the baseline "Conv-stem + Attention" from Stand-Alone Self-Attention in Vision Models, which is referenced in your paper. Could you please help check the correctness? It would be even better if you could suggest further optimizations for this implementation. Thanks!
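For context, an unfold-based local self-attention layer along the lines discussed here can be sketched as below. This is a minimal single-head sketch using `F.unfold` to gather each pixel's local window; the kernel size, absence of relative position embeddings, and layer name are illustrative assumptions, not the exact configuration of the paper or this repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalSelfAttention2d(nn.Module):
    """Sketch of local self-attention over k x k windows via unfolding.

    Single head, no relative position embedding -- simplifying
    assumptions for illustration only.
    """
    def __init__(self, in_channels, out_channels, kernel_size=7):
        super().__init__()
        self.kernel_size = kernel_size
        self.padding = kernel_size // 2
        # 1x1 convolutions produce per-pixel queries, keys, and values
        self.query = nn.Conv2d(in_channels, out_channels, 1, bias=False)
        self.key = nn.Conv2d(in_channels, out_channels, 1, bias=False)
        self.value = nn.Conv2d(in_channels, out_channels, 1, bias=False)

    def forward(self, x):
        b, _, h, w = x.shape
        q = self.query(x)
        k = self.key(x)
        v = self.value(x)
        c = q.shape[1]
        win = self.kernel_size ** 2
        # Unfold gathers each pixel's k x k neighborhood. This is the
        # memory-hungry step: output is (b, c * win, h * w), i.e. every
        # key/value is duplicated win times.
        k = F.unfold(k, self.kernel_size, padding=self.padding)
        v = F.unfold(v, self.kernel_size, padding=self.padding)
        k = k.view(b, c, win, h * w)
        v = v.view(b, c, win, h * w)
        q = q.view(b, c, 1, h * w)
        # Scaled dot-product attention within each local window
        attn = (q * k).sum(dim=1, keepdim=True) / (c ** 0.5)
        attn = attn.softmax(dim=2)                # over the window positions
        out = (attn * v).sum(dim=2)               # (b, c, h * w)
        return out.view(b, c, h, w)
```

The duplication introduced by `F.unfold` (a factor of `kernel_size**2` on the keys and values) is what makes this formulation memory-consuming at 224x224 resolution, as noted in the following comments.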
Hi, @d-li14 Thanks for contributing. It looks correct to me, but the unfolding implementation could take a lot of memory. Could you check if the model really runs on 224x224 images and if it can reproduce the results in the paper? Thanks!
Yes, it is very memory-consuming: a simple test shows that more than 7 GB of memory is used with 8 images per GPU. I will try to verify the accuracy of this model.