Why encoder-decoder architecture?

Open: bsxxdw opened this issue 1 year ago • 1 comment

Hi @LTH14! Congrats on your nice work being accepted to CVPR. As the title says, I'm confused about why you chose an encoder-decoder architecture like MAE. Have you tried an encoder-only architecture like BEiT?

bsxxdw, Jun 21 '23 10:06

We haven't tried an encoder-only structure like BEiT. We chose the MAE-style encoder-decoder structure simply because it was the SOTA method at that time. An encoder-decoder structure also lets us decouple representation learning from generation.

LTH14, Jun 21 '23 18:06
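
For readers following the thread, here is a rough, shape-level sketch of what "decoupling" can look like in a generic MAE-style setup. It is not taken from the authors' code: the sizes, the names `encode` and `decode`, and the single random linear maps standing in for transformer stacks are all assumptions made for illustration, and positional embeddings are left out. The point it tries to show is that the encoder only ever sees visible tokens, while mask tokens enter at the decoder, so the encoder output can be reused as a representation on its own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
NUM_TOKENS, EMBED_DIM, DEC_DIM, VOCAB = 16, 8, 4, 32

# Stand-ins for transformer stacks: one random linear map each.
W_enc = rng.normal(size=(EMBED_DIM, EMBED_DIM))
W_proj = rng.normal(size=(EMBED_DIM, DEC_DIM))
W_dec = rng.normal(size=(DEC_DIM, VOCAB))
mask_token = rng.normal(size=DEC_DIM)  # a learned parameter in a real model


def encode(tokens, visible_idx):
    """Encoder: processes only the visible tokens, no mask tokens."""
    return np.tanh(tokens[visible_idx] @ W_enc)


def decode(latent, visible_idx, num_tokens):
    """Decoder: fills hidden positions with the mask token, then
    predicts logits for every position."""
    full = np.tile(mask_token, (num_tokens, 1))
    full[visible_idx] = latent @ W_proj
    return full @ W_dec


tokens = rng.normal(size=(NUM_TOKENS, EMBED_DIM))
visible_idx = np.sort(rng.choice(NUM_TOKENS, size=4, replace=False))

latent = encode(tokens, visible_idx)              # representation path
logits = decode(latent, visible_idx, NUM_TOKENS)  # generation path

print(latent.shape, logits.shape)
```

In an encoder-only design such as BEiT, by contrast, the mask tokens are fed into the same network that produces the representation, so the two roles share one stack.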