Swin-MAE

Question about Fig.2 in paper

Open Spritea opened this issue 4 months ago • 2 comments

Hi there, thanks for your great work!

I have a question about the paper: I am curious about the middle (reconstruction) image in Fig. 2.

It looks like the reconstructed image is generated entirely by the model, rather than being a composite of reconstructed (masked) regions and the original non-masked regions.

However, this result looks different from the one in the original MAE paper (attached below), where the model's output for non-masked regions is typically just a smooth block without detailed texture. That behavior is expected, since the reconstruction loss is computed only on the masked regions.

Therefore, it's interesting that the Swin-MAE model can generate rich-texture results for non-masked regions. Could you please explain this? Thanks!
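To make the point about the loss concrete, here is a minimal sketch of a masked reconstruction loss in the style of the original MAE. It is my own illustration in plain Python, not code from the Swin-MAE repository; the function name `masked_mse` and the toy data are hypothetical. It shows that predictions on non-masked patches do not affect the loss at all, so nothing pushes the model to reproduce texture there.

```python
def masked_mse(pred, target, mask):
    """Mean squared error averaged over masked patches only.

    pred, target: lists of patches, each patch a flat list of pixel values.
    mask: list of 0/1 flags, 1 = patch was masked (hidden from the encoder).
    """
    # Mean squared error per patch.
    per_patch = [
        sum((p - t) ** 2 for p, t in zip(pred_patch, target_patch)) / len(pred_patch)
        for pred_patch, target_patch in zip(pred, target)
    ]
    # Keep only masked patches, then average over the number of masked patches.
    return sum(loss * m for loss, m in zip(per_patch, mask)) / sum(mask)


# Toy data (hypothetical): patch 0 is visible, patch 1 is masked.
target = [[1.0, 1.0], [2.0, 2.0]]
mask = [0, 1]

pred_a = [[1.0, 1.0], [2.0, 3.0]]       # perfect on the visible patch
pred_b = [[100.0, -100.0], [2.0, 3.0]]  # wildly wrong on the visible patch

loss_a = masked_mse(pred_a, target, mask)
loss_b = masked_mse(pred_b, target, mask)
```

Both predictions get the same loss, because the visible patch is multiplied by a mask value of 0 before averaging.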

image

image
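For reference, the "composition" I mention above would look roughly like the following. This is a minimal sketch under my own assumptions (the helper name `composite` and the toy patches are hypothetical, not taken from either codebase): model output is pasted in only where a patch was masked, and the original pixels are kept everywhere else.

```python
def composite(original, reconstruction, mask):
    """Per patch: use the model output where mask == 1, else the original patch."""
    return [
        rec if m == 1 else orig
        for orig, rec, m in zip(original, reconstruction, mask)
    ]


# Toy data (hypothetical): 4 patches, each a flat list of pixel values.
original = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
reconstruction = [[0.9, 1.1], [2.2, 1.8], [0.0, 0.0], [4.1, 3.9]]
mask = [0, 1, 0, 1]  # 1 = patch was masked

pasted = composite(original, reconstruction, mask)
```

In such a composite the non-masked patches always look sharp because they are copied from the input, which is why I am asking whether the middle image in Fig. 2 is a composite or the raw model output.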

Spritea, Oct 01 '24 15:10