dinov2 Retrieving original pixel data from patches.

Open CaedenMotley opened this issue 4 months ago • 6 comments

I am trying to relate features to individual pixels rather than to whole patches. My issue is that I cannot find a way to recover the original pixel data contained within the patches. For example, when a 210 x 210 image is passed through the model, it returns an "x_norm_patchtokens" tensor of shape (1, 225, 1024), i.e. one 1024-dimensional feature for each patch in a 15 x 15 grid of 14 x 14 patches. I would like to transform this into a (210 x 210 x 1024) tensor, using the pixels contained within each patch rather than treating each patch as a single element. My original thought was to reshape into (√(225 x 14^2), √(225 x 14^2), 1024), but a plain reshape cannot work because that shape holds far more elements than the original tensor. Is this retrieval possible? If so, any help on how to do it would be greatly appreciated. Thank you!
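One possible way to get a per-pixel feature map of the requested shape is to copy each patch token to every pixel that its patch covers (nearest-neighbour upsampling). The sketch below is only an illustration under stated assumptions, not a method confirmed by the dinov2 authors: it assumes the tokens are ordered row by row over the patch grid, that the patch size is 14, and it uses a random NumPy array with 8 channels as a stand-in for the real "x_norm_patchtokens" output (with the batch dimension already dropped). The helper name `patch_tokens_to_pixel_grid` is hypothetical.

```python
import numpy as np


def patch_tokens_to_pixel_grid(tokens, image_hw, patch_size=14):
    """Copy each patch token to every pixel covered by its patch.

    tokens:   array of shape (num_patches, channels), assumed row-major
              over the patch grid (an assumption, not verified here).
    image_hw: (height, width) of the input image in pixels.
    Returns an array of shape (height, width, channels).
    """
    height, width = image_hw
    grid_h, grid_w = height // patch_size, width // patch_size
    num_patches, channels = tokens.shape
    if num_patches != grid_h * grid_w:
        raise ValueError(
            f"expected {grid_h * grid_w} patch tokens, got {num_patches}"
        )
    # (num_patches, C) -> (grid_h, grid_w, C)
    grid = tokens.reshape(grid_h, grid_w, channels)
    # Repeat every grid cell patch_size times along rows, then columns.
    return grid.repeat(patch_size, axis=0).repeat(patch_size, axis=1)


# Stand-in for x_norm_patchtokens[0]; 8 channels instead of 1024 to keep
# the demo small. 225 tokens correspond to a 15 x 15 grid for a 210 x 210 image.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((225, 8)).astype(np.float32)

pixel_features = patch_tokens_to_pixel_grid(tokens, (210, 210))
print(pixel_features.shape)
```

Note that this does not recover any per-pixel information: all 196 pixels inside a patch receive the same feature vector. A smoother map could be obtained by interpolating the 15 x 15 grid instead of repeating it, but that is likewise an approximation.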

Feb 06 '24 21:02 CaedenMotley
