ViT-pytorch

PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

28 ViT-pytorch issues

https://github.com/jeonsworld/ViT-pytorch/blob/460a162767de1722a014ed2261463dbbc01196b6/models/modeling.py#L132C1-L136C31 Here the calculations of `n_patches` and `patch_size` seem incorrect. I think if 16 x 16 patches are assumed, the grid size is already determined? Or am I...
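For reference, here is a minimal sketch of the standard (non-hybrid) ViT patch arithmetic the issue is comparing against, assuming a square input image and square patches; the concrete numbers are illustrative, not taken from the linked code:

```python
# Standard ViT patch arithmetic (sketch, not the repo's hybrid branch).
# With a fixed patch_size, the grid size follows directly from img_size.
img_size = 224
patch_size = 16

grid_size = img_size // patch_size   # 224 // 16 = 14 patches per side
n_patches = grid_size ** 2           # 14 * 14 = 196 patches total

print(grid_size, n_patches)  # 14 196
```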

I would like to ask about the model code for the reconstruction task. Does the author have any plans to support a reconstruction task?

Thanks for your great work! I am curious why we need to account for the residual connections when visualizing attention maps. ![image](https://user-images.githubusercontent.com/55536181/146325681-1a608405-f341-4f8c-824d-f4829b2a0ea1.png)
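The notebook appears to follow attention rollout (Abnar & Zuidema, 2020). Since each transformer layer computes `x + Attention(x)`, a token's output mixes its own value (the residual path) with the attended values; adding the identity matrix to each attention map models that residual path before propagating through layers. A minimal sketch, assuming per-layer attention maps already averaged over heads:

```python
import torch

def rollout(attentions):
    """attentions: list of (num_tokens, num_tokens) attention maps,
    one per layer, already averaged over heads."""
    result = torch.eye(attentions[0].size(-1))
    for att in attentions:
        att = att + torch.eye(att.size(-1))        # model the residual connection
        att = att / att.sum(dim=-1, keepdim=True)  # re-normalize each row
        result = att @ result                      # propagate attention across layers
    return result
```

Without the identity term, the rollout would imply that no information flows through the residual path, which understates how much each token attends to itself.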

When using `-m torch.distributed.launch --nproc_per_node=2` as an example, the launcher passes `--local-rank` rather than the expected `--local_rank`, which generates a fatal error. https://github.com/jeonsworld/ViT-pytorch/blob/460a162767de1722a014ed2261463dbbc01196b6/train.py#L281 Setting the arg to local-rank in train.py resolved...
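One hedged way to stay compatible with both launcher generations is to register both spellings of the flag in argparse (the defaults and help text below are illustrative, not the repo's exact values):

```python
import argparse

parser = argparse.ArgumentParser()
# Older torch.distributed.launch passes --local_rank; newer versions pass
# --local-rank. argparse accepts multiple option strings for one argument
# and derives the dest ("local_rank") from the first one.
parser.add_argument("--local_rank", "--local-rank", type=int, default=-1,
                    help="local rank for distributed training on GPUs")
args = parser.parse_args()
```

Recent PyTorch also exposes the rank via the `LOCAL_RANK` environment variable, which sidesteps the flag-spelling issue entirely.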

Hello! The image link in visualize_attention_map.ipynb can no longer be opened. Could you please update the link? The relevant code: img_url = "https://images.mypetlife.co.kr/content/uploads/2019/04/09192811/welsh-corgi-1581119_960_720.jpg"

What accuracy do you get when training from scratch on CIFAR-100? Have you tried it?

Hello, in "visualize_attention_map.ipynb" the trained model is loaded with the following line: model.load_from(np.load("attention_data/Vit-B_16-224.npz")) I used your train.py to fine-tune Vit-B_16-224.npz with my custom data, which produced a PyTorch model...
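A hedged sketch of the distinction: `load_from()` only understands the original JAX/Flax `.npz` weight archives, while train.py (as far as I can tell) saves a plain PyTorch state_dict with `torch.save`, so a fine-tuned checkpoint goes through `load_state_dict` instead. The `num_classes` value and checkpoint path below are illustrative, and the model construction assumes the notebook's `VisionTransformer`/`CONFIGS` setup:

```python
import torch
from models.modeling import VisionTransformer, CONFIGS

# Build the model with the same config and head size used for fine-tuning.
config = CONFIGS["ViT-B_16"]
model = VisionTransformer(config, img_size=224, num_classes=10, vis=True)

# A checkpoint written by train.py is a PyTorch state_dict, not an .npz
# archive, so load it with load_state_dict rather than load_from.
state_dict = torch.load("output/my_finetuned_checkpoint.bin", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()
```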

Would you like a Dockerfile? I will send it :)

Hi, Thank you for your nice implementation. I get the following error when loading the pre-trained weights: _KeyError: 'Transformer/encoderblock_0\\MultiHeadDotProductAttention_1/query\\kernel is not a file in the archive'_ Would you please help...
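The mixed slashes in that key suggest a Windows path issue: if the weight-key strings are built with `os.path.join` (the `pjoin` used in models/modeling.py), Windows inserts backslashes, while the `.npz` archive stores its keys with forward slashes. A hedged sketch of the fix, with illustrative key fragments modeled on the error message:

```python
import os

ROOT = "Transformer/encoderblock_0"
ATTENTION_Q = "MultiHeadDotProductAttention_1/query"

# On Windows, os.path.join produces backslash separators, which do not
# match the forward-slash keys inside the .npz archive:
bad_key = os.path.join(ROOT, ATTENTION_Q, "kernel")

# Joining with an explicit "/" is platform-independent and matches the
# archive keys:
good_key = "/".join([ROOT, ATTENTION_Q, "kernel"])
# "Transformer/encoderblock_0/MultiHeadDotProductAttention_1/query/kernel"
```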