esvit
EsViT: Efficient self-supervised Vision Transformers
Bumps [numpy](https://github.com/numpy/numpy) from 1.19.3 to 1.22.0. Release notes (sourced from numpy's releases): v1.22.0, NumPy 1.22.0 Release Notes. NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...
Hi, I used swin-transformer.py to load the Swin-Tiny model pretrained on ImageNet-1K, and got the following message: msg: _IncompatibleKeys(missing_keys=['layers.0.blocks.1.attn_mask', 'layers.1.blocks.1.attn_mask', 'layers.2.blocks.1.attn_mask', 'layers.2.blocks.3.attn_mask', 'layers.2.blocks.5.attn_mask', 'head.weight', 'head.bias'], unexpected_keys=['head.mlp.0.weight', 'head.mlp.0.bias', 'head.mlp.2.weight', 'head.mlp.2.bias',...
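The message above lists two kinds of mismatch: names the model expects but the checkpoint lacks (`missing_keys`), and names the checkpoint carries but the model does not (`unexpected_keys`). In PyTorch this report is what `load_state_dict(..., strict=False)` returns. The sketch below reproduces the bookkeeping in plain Python, assuming a checkpoint can be treated as a dict of name to value. `split_keys` and `drop_prefix` are hypothetical helpers, not part of the EsViT code, and `patch_embed.proj.weight` is an illustrative key name rather than one quoted in the issue.

```python
def split_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) lists, mirroring a non-strict load report."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    missing = [k for k in model_keys if k not in ckpt_set]
    unexpected = [k for k in checkpoint_keys if k not in model_set]
    return missing, unexpected


def drop_prefix(state, prefix):
    """Keep only the entries whose name does not start with `prefix`."""
    return {k: v for k, v in state.items() if not k.startswith(prefix)}


# Toy stand-ins for a classifier's parameter names and a pretraining checkpoint.
model_keys = [
    "patch_embed.proj.weight",      # illustrative backbone key (assumption)
    "layers.0.blocks.1.attn_mask",  # reported missing in the issue
    "head.weight",
    "head.bias",
]
checkpoint = {
    "patch_embed.proj.weight": 0,
    "head.mlp.0.weight": 0,  # projection-head keys, reported as unexpected
    "head.mlp.0.bias": 0,
}

# Discard the pretraining head before comparing, so only backbone keys remain.
backbone_only = drop_prefix(checkpoint, "head.")
missing, unexpected = split_keys(model_keys, backbone_only)
```

After filtering, the only keys left unmatched in this toy case are the classifier head and the `attn_mask` entry. Whether the missing `attn_mask` entries matter depends on how the model constructs them, which this sketch does not address.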
Hello. Thank you for the wonderful work! I have some questions about the learning rate used to pretrain the Swin model in Table 1. As the logs show, the learning...
Can I use this code to train on a single-GPU machine (RTX 5000)? If yes, how?
It is useful in object detection context to allow arbitrary sizes by doing dynamic mask computation (probably possible only with relative position encoding). These kinds of edits were done in...
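Dynamic mask computation can be illustrated with a minimal sketch in plain Python. It assumes the usual shifted-window scheme: the feature map is padded up to a multiple of the window size, each axis is cut into three bands, and two positions inside a window may attend to each other only if they carry the same band label. The function names, the padding choice, and the `-100.0` fill value are assumptions for illustration, not taken from the EsViT code.

```python
def band(pos, size, window, shift):
    """Label a coordinate by the band it falls in along one axis (0, 1 or 2)."""
    if pos < size - window:
        return 0
    if pos < size - shift:
        return 1
    return 2


def shifted_window_mask(height, width, window, shift, neg=-100.0):
    """Build one (window*window) x (window*window) mask per window.

    The mask is recomputed from the actual input size, so any height and
    width are accepted; both are first padded up to a multiple of `window`.
    """
    hp = height + (-height % window)  # padded height
    wp = width + (-width % window)    # padded width
    masks = []
    for top in range(0, hp, window):
        for left in range(0, wp, window):
            # Region label of every position in this window, row-major order.
            labels = [
                3 * band(r, hp, window, shift) + band(c, wp, window, shift)
                for r in range(top, top + window)
                for c in range(left, left + window)
            ]
            # 0.0 where two positions share a region, `neg` otherwise.
            masks.append(
                [[0.0 if a == b else neg for b in labels] for a in labels]
            )
    return masks


# A 5x7 feature map is padded to 8x8, giving four 4x4 windows.
masks = shifted_window_mask(5, 7, window=4, shift=2)
```

Because the mask depends only on the input size, the window size, and the shift, it can be rebuilt for every batch instead of being stored as a fixed buffer.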
Hello, I have been studying your article recently. I noticed that your slides describe pre-training Task 2 (region-level), as shown in the picture above. But doesn't the actual code input...
Hi, I was playing around with a custom dataset with SwinTiny and ResNet50. SwinTiny works great (both training and Linear Evaluation). However, it seems like ResNet50 isn't supported in the...
Hello, could you please provide the arguments used to run `main_esvit.py` for each run in the table below (first table in README)? Are the args used...
I found the swin_large_patch4_window7_224.yaml config file in your code. This raises an interesting question: how does the larger model perform?
Hi! First of all kudos on the great work! So, I am experimenting on a custom dataset of about 70k images consisting of 7 different classes. However, the model seems...