ViT-pytorch

PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Results 29 ViT-pytorch issues

Hello, I would like to ask you about this step in the README: "The default batch size is 512. When GPU memory is insufficient, you can proceed with training by adjusting...
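The quoted README sentence is cut off before it names the setting to adjust. One common technique for training with a large effective batch on limited GPU memory is gradient accumulation; whether that is what the README refers to is an assumption here. The sketch below uses a hypothetical one-parameter model (`y = w * x`) in plain Python to show why it works: summing the gradients of equally sized micro-batches, each scaled by `1 / accumulation_steps`, gives the same gradient as the full batch.

```python
def mse_grad(w, xs, ys):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, accumulation_steps):
    """Sum micro-batch gradients, each scaled by 1 / accumulation_steps."""
    assert len(xs) % accumulation_steps == 0, "batch must split evenly"
    micro = len(xs) // accumulation_steps  # samples held in memory at once
    total = 0.0
    for step in range(accumulation_steps):
        lo, hi = step * micro, (step + 1) * micro
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / accumulation_steps
    return total


# Hypothetical toy data: 8 samples, processed 2 at a time over 4 steps.
xs = [float(i) for i in range(8)]
ys = [3.0 * x + 1.0 for x in xs]
full = mse_grad(0.5, xs, ys)
accum = accumulated_grad(0.5, xs, ys, accumulation_steps=4)
print(full, accum)
```

In a real training loop the same idea means calling the optimizer step only once every `accumulation_steps` micro-batches, so peak memory follows the micro-batch size rather than the effective batch size.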

Hi @jeonsworld, the imagenet21k_R50+ViT-L_32.npz model has been released. Could you document how to use imagenet21k_R50+ViT-L_32.npz? I tried it with your current source, but I kept running into problems. Thanks.

I added data augmentation methods such as translation, rotation, and scaling to the test data samples, hoping to exploit the inductive bias of the CNN, but R50+ViT did not achieve the...

Sorry, your experiments report the training time. I wonder which GPU you used, and how many of them?

I noticed that you used class StdConv2d(nn.Conv2d): def forward(self, x): w = self.weight v, m = torch.var_mean(w, dim=[1, 2, 3], keepdim=True, unbiased=False) w = (w - m) / torch.sqrt(v +...
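The snippet above is cut off mid-expression, so the following is only a sketch of the general technique it appears to use, weight standardization: the kernel is normalized to zero mean and unit variance per output channel before the convolution is applied. The class name `WeightStandardizedConv2d`, the helper `standardized_weight`, and the epsilon value `1e-5` are assumptions for illustration, not taken from the repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightStandardizedConv2d(nn.Conv2d):
    """Conv2d whose kernel is standardized per output channel before use."""

    def standardized_weight(self, eps=1e-5):
        # eps is an assumed value; the quoted snippet ends before the constant.
        w = self.weight
        # Biased variance and mean over (in_channels, kernel_h, kernel_w).
        v, m = torch.var_mean(w, dim=[1, 2, 3], keepdim=True, unbiased=False)
        return (w - m) / torch.sqrt(v + eps)

    def forward(self, x):
        return F.conv2d(x, self.standardized_weight(), self.bias,
                        self.stride, self.padding, self.dilation, self.groups)


torch.manual_seed(0)
conv = WeightStandardizedConv2d(3, 4, kernel_size=3, padding=1, bias=False)
out = conv(torch.randn(2, 3, 8, 8))
print(out.shape)
```

The learned parameter `self.weight` is left untouched; only the kernel used in the forward pass is standardized, so gradients flow back through the normalization.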

I am using a custom dataset with 6 classes, so I modified data_utils.py and set 'num_classes = 6' in train.py. But I got these errors: Training (X /...

vit = VisionTransformer(CONFIGS['R50-ViT-B_16'], zero_head=False, img_size=200) leads to "float division by zero" exception: --------------------------------------------------------------------------- ZeroDivisionError Traceback (most recent call last) in ----> 1 vit = VisionTransformer(CONFIGS['R50-ViT-B_16'], zero_head=False, img_size=200) ViT-pytorch/models/modeling.py in __init__(self,...
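The traceback is truncated, so the actual cause is not visible here. One possible explanation, offered as an assumption only: if the hybrid configuration derives its patch size by floor-dividing the backbone's feature-map size by a fixed grid, small images can produce a patch size of zero, which then breaks later arithmetic. The function `hybrid_patch_size` and the values `grid=14` and `backbone_stride=16` are hypothetical and chosen for illustration.

```python
def hybrid_patch_size(img_size, grid=14, backbone_stride=16):
    """Patch size on the backbone feature map, under the assumed scheme.

    grid and backbone_stride are assumed values, not read from the repository.
    """
    feature_map = img_size // backbone_stride  # spatial size after the CNN
    return feature_map // grid                 # 0 when feature_map < grid


for size in (200, 224, 448):
    print(size, hybrid_patch_size(size))
```

Under these assumptions an input of 200 pixels yields a feature map smaller than the grid, so the patch size floors to zero, while 224 and larger multiples do not. Checking the computed patch size before building the model would confirm or rule this out.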

Hi, perhaps I didn't understand your code clearly, but I think you didn't split the dataset into train, val, and test sets. You directly validate and test your...
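For reference, a three-way split of the kind this issue asks about can be sketched as follows. This is a minimal, framework-independent example; the function `split_indices`, the 10% fractions, and the seed are assumptions and do not describe how the repository handles its data.

```python
import random


def split_indices(n, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle range(n) reproducibly and cut it into train/val/test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The validation indices would be used for model selection and hyperparameter tuning, with the test indices held back for a single final evaluation.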