dinov2
How to run segmentation or depth inference on an image, like the demo?
I have not found an easy way to do that, but I managed to extract features:
`
import torch
from PIL import Image
import torchvision.transforms as T
import hubconf  # run this from the root of a dinov2 checkout

dinov2_vits14 = hubconf.dinov2_vits14()
dinov2_vits14.eval()

image_transforms = T.Compose([
    T.Resize(256, interpolation=T.InterpolationMode.BICUBIC),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

# Convert to RGB first so Normalize always sees exactly 3 channels
img = Image.open('pic.jpg').convert('RGB')
img = image_transforms(img).unsqueeze(0)  # shape: (1, 3, 224, 224)

with torch.no_grad():
    features = dinov2_vits14.forward_features(img)

for k, v in features.items():
    print(k, v.shape if v is not None else v)
`
x_norm_clstoken torch.Size([1, 384])
x_norm_patchtokens torch.Size([1, 256, 384])
x_prenorm torch.Size([1, 257, 384])
masks None
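One possible next step toward a segmentation-style output is to reshape the 256 patch tokens into a 16x16 feature map (224 / 14 = 16 patches per side) and run a per-patch head on it. The sketch below only shows the shapes involved. It assumes a random tensor standing in for `x_norm_patchtokens`, and the `head` and `num_classes` are hypothetical placeholders: the head is untrained, so its predictions are meaningless until it has been fitted on labeled data. It is not the method used by the demo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for features["x_norm_patchtokens"]: (batch, patches, dim)
patch_tokens = torch.randn(1, 256, 384)

batch, num_patches, dim = patch_tokens.shape
grid = int(num_patches ** 0.5)  # 16 for a 224x224 input with 14x14 patches
assert grid * grid == num_patches, "expected a square patch grid"

# (B, N, C) -> (B, H, W, C) -> (B, C, H, W), the layout conv layers expect
feature_map = patch_tokens.reshape(batch, grid, grid, dim).permute(0, 3, 1, 2)

# Hypothetical linear head: a 1x1 convolution classifies each patch independently.
num_classes = 21  # assumption, pick whatever your dataset uses
head = nn.Conv2d(dim, num_classes, kernel_size=1)  # untrained

with torch.no_grad():
    logits = head(feature_map)  # (B, num_classes, 16, 16)
    # Upsample the coarse patch-level logits back to the input resolution
    logits = F.interpolate(logits, size=(224, 224), mode="bilinear",
                           align_corners=False)
    segmentation = logits.argmax(dim=1)  # (B, 224, 224) class index per pixel

print(feature_map.shape)
print(segmentation.shape)
```

For depth, the same reshaped feature map could feed a head with a single output channel instead of `num_classes`, again only after training it on depth labels.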