dinov2

Retrieving original pixel data from patches.

Open CaedenMotley opened this issue 4 months ago • 6 comments

I am trying to relate features to individual pixels rather than to whole patches. My issue is that I cannot find a way to recover the original pixel data contained within each patch. For example, when a 210 x 210 image is passed through the model, it returns an "x_norm_patchtokens" tensor of shape (1, 225, 1024), i.e. one 1024-dimensional feature for each patch in a 15 x 15 grid of 14 x 14 patches. I would like to transform this into a (210 x 210 x 1024) tensor, using the pixels contained within each patch rather than treating the patch as a single element. My original thought was to reshape into (√(225 x 14^2), √(225 x 14^2), 1024) = (210, 210, 1024), but that needs 14^2 = 196 times as many elements as the original tensor holds, so a plain reshape cannot work. Is this retrieval possible? If so, any help on how to do it would be greatly appreciated. Thank you!
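To make the shape arithmetic above concrete, here is a minimal sketch of one possible workaround, not a method confirmed anywhere in this thread. It folds the 225 tokens back into a 15 x 15 grid and then upsamples that grid to the image size, so every pixel inherits the feature of the patch it lies in. The function name `patch_tokens_to_pixel_grid` is hypothetical, and the sketch assumes the tokens are ordered row by row (left to right, top to bottom) and that the patch size is 14, as the 14^2 in the question implies. Note that this only copies or interpolates patch features; it does not recover any genuinely per-pixel information.

```python
import torch
import torch.nn.functional as F

PATCH_SIZE = 14  # assumption: implied by the 14^2 in the question (210 / 14 = 15)


def patch_tokens_to_pixel_grid(patch_tokens, image_hw, patch_size=PATCH_SIZE,
                               mode="nearest"):
    """Hypothetical helper: (B, N, C) patch tokens -> (B, H, W, C) feature map.

    Assumes the N tokens are laid out in row-major order over the patch grid.
    """
    b, n, c = patch_tokens.shape
    h, w = image_hw
    grid_h, grid_w = h // patch_size, w // patch_size
    if grid_h * grid_w != n:
        raise ValueError(
            f"expected {grid_h * grid_w} tokens for a {h}x{w} image, got {n}"
        )

    # (B, N, C) -> (B, grid_h, grid_w, C) -> (B, C, grid_h, grid_w)
    grid = patch_tokens.reshape(b, grid_h, grid_w, c).permute(0, 3, 1, 2)

    out_size = (grid_h * patch_size, grid_w * patch_size)
    if mode == "nearest":
        # Every pixel in a patch gets an exact copy of that patch's feature.
        up = F.interpolate(grid, size=out_size, mode="nearest")
    else:
        # Smooth alternative: blend the features of neighbouring patches.
        up = F.interpolate(grid, size=out_size, mode="bilinear",
                           align_corners=False)

    # Back to channels-last: (B, H, W, C)
    return up.permute(0, 2, 3, 1)
```

With the tensor from the question, `patch_tokens_to_pixel_grid(tokens, (210, 210))` would yield shape (1, 210, 210, 1024). In float32 that is roughly 180 MB per image, so it may be worth indexing the 15 x 15 grid with `(y // 14, x // 14)` instead of materialising the full map.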

CaedenMotley Feb 06 '24 21:02