Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
I want to know about the dataset: how was the ImageNet-22K dataset enlarged by 5 times into millions of images with noisy labels? Is it just data augmentation? Hoping for your reply, thank...
Thanks for your remarkable work. When I use the pretrained model, I find it difficult to scale the input image size; even when I use an image size like `window_size*2^n`, it...
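One common reason resizing a pretrained Swin checkpoint fails is that the relative position bias table inside each attention block is tied to the pretrained window size. Below is a minimal sketch of the usual workaround, bicubic interpolation of that table; the function name and shapes are illustrative and not part of this repo's API:

```python
import torch
import torch.nn.functional as F

def resize_rel_pos_bias(table: torch.Tensor, new_window_size: int) -> torch.Tensor:
    """Bicubically interpolate a relative position bias table to a new window size.

    `table` has shape ((2*W - 1) ** 2, num_heads) for a square window of size W.
    """
    length, num_heads = table.shape
    old_side = int(length ** 0.5)        # 2 * W_old - 1
    new_side = 2 * new_window_size - 1   # 2 * W_new - 1
    # (L, H) -> (1, H, old_side, old_side) so F.interpolate can treat it as an image
    t = table.permute(1, 0).reshape(1, num_heads, old_side, old_side)
    t = F.interpolate(t, size=(new_side, new_side), mode='bicubic', align_corners=False)
    return t.reshape(num_heads, new_side * new_side).permute(1, 0)
```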
How can I get the model's score on a single image, to classify the image as real or fake? Thanks.
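For reference, scoring one image typically just means running the model in eval mode and applying softmax to the logits. A minimal sketch follows; the checkpoint path, the two-class (real/fake) head, and the use of `timm` to build the backbone are assumptions for illustration, not this repo's setup:

```python
import torch
from PIL import Image
from torchvision import transforms
import timm  # assumption: model built via timm for brevity

# Hypothetical: a Swin backbone fine-tuned with a 2-class (real/fake) head.
model = timm.create_model('swin_base_patch4_window7_224', num_classes=2)
state = torch.load('real_fake_finetuned.pth', map_location='cpu')  # hypothetical path
model.load_state_dict(state['model'] if 'model' in state else state)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

img = preprocess(Image.open('test.jpg').convert('RGB')).unsqueeze(0)  # (1, 3, 224, 224)
with torch.no_grad():
    probs = torch.softmax(model(img), dim=1)[0]
print({'real': probs[0].item(), 'fake': probs[1].item()})
```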
The parameter `ape`, which stands for adding an absolute position embedding to the patch embedding, is set to `False` by default in the config file. To my knowledge, for transformer models, input...
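For context, this is roughly how the flag behaves in the model: when `ape` is true, a learned absolute position embedding is added to the patch embeddings before the Swin blocks. A simplified sketch, not the repo's exact code:

```python
import torch
import torch.nn as nn

class PatchEmbedWithAPE(nn.Module):
    """Simplified: patch embeddings with an optional absolute position embedding."""

    def __init__(self, num_patches: int, embed_dim: int, ape: bool = False):
        super().__init__()
        self.ape = ape
        if ape:
            # One learned embedding per patch position, as in ViT.
            self.absolute_pos_embed = nn.Parameter(
                torch.zeros(1, num_patches, embed_dim))
            nn.init.trunc_normal_(self.absolute_pos_embed, std=0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, num_patches, embed_dim) patch embeddings
        if self.ape:
            x = x + self.absolute_pos_embed
        return x
```

Swin mainly relies on the relative position bias inside each window's attention, which the paper reports works better than absolute embeddings; that is consistent with the `False` default.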
Thanks for releasing the code! We used your codebase to train and test some models and evaluated their performance on ImageNet-1K (without ImageNet-22K pretraining). However, according to the training...
Great work! I have some questions about fine-tuning on ImageNet-1K. In the paper, you state that the 384^2-input models are obtained by fine-tuning, as also pointed out in #24: > For other...
Can you provide the parameter settings for fine-tuning at 384 from 224?
The current configs seem to contain only the fine-tuning process from ImageNet-22K pretrained models to ImageNet-1K. Could you please provide the ImageNet-22K pretraining schedule?
Thanks for your great work. I want to fine-tune on ImageNet-1K from an ImageNet-21K pretrained model, because that is faster for reproducing your results. How should I do it? I tried to...
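A common starting point for 22K-to-1K fine-tuning is to load the pretrained backbone and drop the 22K classification head so that the 1K head is re-initialized. A minimal sketch; the checkpoint filename and the use of `timm` are assumptions, and the repo's own fine-tuning configs may differ:

```python
import torch
import timm  # assumption: model built via timm for brevity

# Build a 1000-class Swin and load ImageNet-22K pretrained weights,
# discarding the 22K classification head so it is re-initialized for 1K.
model = timm.create_model('swin_base_patch4_window7_224', num_classes=1000)
ckpt = torch.load('swin_base_patch4_window7_224_22k.pth', map_location='cpu')
state = ckpt.get('model', ckpt)
for k in ('head.weight', 'head.bias'):
    state.pop(k, None)  # the 22K head shape does not match the 1K head
missing, unexpected = model.load_state_dict(state, strict=False)
print('re-initialized:', missing)  # expect only the head parameters
# ...then fine-tune with a short schedule (the paper reports 30 epochs
# at a low constant learning rate for its fine-tuning runs).
```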
apex