Transformer-SSL
This is an official implementation for "Self-Supervised Learning with Swin Transformers".
Start command:
```
imagenetpath=mypath
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345 moby_main.py \
    --cfg configs/moby_swin_tiny.yaml --data-path ${imagenetpath} --batch-size 256
```
but I get `Gradient overflow. Skipping step, loss...`
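The "Gradient overflow. Skipping step" message comes from dynamic loss scaling in apex AMP: when inf/nan gradients are detected, the optimizer step is skipped and the loss scale is lowered, which is expected occasionally early in training. A rough, purely illustrative sketch of the mechanism (not the repository's or apex's actual code):

```python
# Minimal sketch of apex-style dynamic loss scaling -- illustrative only,
# not the implementation used by the repository or by apex itself.
class DynamicLossScaler:
    def __init__(self, init_scale=2.0 ** 16, scale_factor=2.0, scale_window=2000):
        self.scale = init_scale          # loss is multiplied by this before backward()
        self.scale_factor = scale_factor
        self.scale_window = scale_window
        self.good_steps = 0              # consecutive steps without overflow

    def update(self, has_overflow):
        """Call once per iteration after checking gradients for inf/nan."""
        if has_overflow:
            # This is the path that logs "Gradient overflow. Skipping step":
            # the optimizer step is skipped and the scale is reduced.
            self.scale /= self.scale_factor
            self.good_steps = 0
            return False  # caller should skip optimizer.step()
        self.good_steps += 1
        if self.good_steps % self.scale_window == 0:
            self.scale *= self.scale_factor  # cautiously try a larger scale again
        return True  # safe to step
```

If the message appears only in the first iterations and then stops, training is usually fine; a persistent stream of overflows is what indicates a real problem.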
I want to use your work to perform a few epochs of pretraining on my dataset, which contains several similar vehicle categories. So I load the ImageNet-pretrained checkpoint and run another pretraining on...
Hi authors, I have pretrained your moby_swin_tiny model using 8 Tesla V100 GPUs and reproduced your results on downstream tasks. I get 74.394% on linear evaluation and 43.1% on COCO...
Thank you very much for this great paper. I would like to ask: will apex mixed-precision training affect the accuracy of the model? I tried to install it using the...
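Mixed precision generally has only a small effect on final accuracy, and if installing apex is the obstacle, recent PyTorch versions ship native automatic mixed precision that serves the same purpose. A minimal sketch (an assumption about your setup, not the repository's code; the toy `Linear` model is just a placeholder):

```python
# Sketch of PyTorch-native AMP as an alternative to apex.
# The tiny Linear model and SGD settings are placeholders, not the repo's config.
import torch

use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"

model = torch.nn.Linear(8, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# GradScaler plays the role of apex's dynamic loss scaler; it is a no-op on CPU.
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

x = torch.randn(4, 8, device=device)
with torch.autocast(device_type=device, enabled=use_cuda):  # fp16 regions on GPU
    loss = model(x).sum()

scaler.scale(loss).backward()  # scale loss to avoid fp16 gradient underflow
scaler.step(optimizer)         # unscales grads, skips the step on overflow
scaler.update()                # adjusts the loss scale for the next iteration
```

With `enabled=False` everything degrades to an ordinary fp32 training step, so the same loop works with or without mixed precision.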
Download link of DeiT-S model: https://github.com/SwinTransformer/storage/releases/download/v1.0.3/moby_swin_t_300ep_pretrained.pth Download link of Swin-T model: https://github.com/SwinTransformer/storage/releases/download/v1.0.3/moby_deit_small_300ep_pretrained.pth  Look at the **last part** of the download link. I think the model links should be interchanged.
I want to train MoBY-SwinT on my custom dataset. My machine has one GPU. I tried a few things but failed with the following errors. All packages are installed. * First try...
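For a single-GPU machine, the distributed launch command above can simply be reduced to one process; a sketch, assuming the same entry point and config (the data path and batch size below are placeholders to adapt):

```shell
# Hypothetical single-GPU launch; /path/to/dataset and the batch size are placeholders.
CUDA_VISIBLE_DEVICES=0 \
python -m torch.distributed.launch --nproc_per_node 1 --master_port 12345 moby_main.py \
    --cfg configs/moby_swin_tiny.yaml --data-path /path/to/dataset --batch-size 64
```

Note that with fewer GPUs the effective global batch size shrinks, which may require adjusting the learning rate to match the paper's setup.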
Under Transformer-SSL/data/build.py, inside the `build_transform` function, under the "byol" augmentation type, the interpolation method used in RandomResizedCrop is the default, which is BILINEAR; however, in the BYOL paper the authors used...
```
Traceback (most recent call last):
  File "moby_linear.py", line 385, in <module>
    main(config)
  File "moby_linear.py", line 174, in main
    train_one_epoch(config, model, criterion, data_loader_train, optimizer, epoch, mixup_fn, lr_scheduler)
  File "moby_linear.py", line 199, in ...
```
Hello dear authors, thank you for providing your work and code. I understand from your paper that you used patch size = 4 in all your models; is there any...
Wonderful job! I recently read your code and have some questions about the Swin model in swin_transformer.py. Concretely, I can't understand the calculation of relative_position_index and attn_mask....
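On the `relative_position_index` question: it maps every ordered pair of positions inside a window to a bucket indexing the relative-position bias table. The logic from swin_transformer.py can be reproduced standalone (assuming a typical 7x7 Swin-T window) like this:

```python
# Standalone reproduction of Swin's relative_position_index computation.
# window_size=(7, 7) is an assumption (the Swin-T default).
import torch

window_size = (7, 7)  # (Wh, Ww)

coords_h = torch.arange(window_size[0])
coords_w = torch.arange(window_size[1])
coords = torch.stack(torch.meshgrid(coords_h, coords_w, indexing="ij"))  # 2, Wh, Ww
coords_flatten = torch.flatten(coords, 1)  # 2, Wh*Ww

# Pairwise differences: relative_coords[:, i, j] = coords[i] - coords[j]
relative_coords = coords_flatten[:, :, None] - coords_flatten[:, None, :]  # 2, N, N
relative_coords = relative_coords.permute(1, 2, 0).contiguous()  # N, N, 2

# Shift offsets from [-(Wh-1), Wh-1] x [-(Ww-1), Ww-1] to start at 0 ...
relative_coords[:, :, 0] += window_size[0] - 1
relative_coords[:, :, 1] += window_size[1] - 1
# ... then flatten the 2-D offset into a single bucket id:
# index = dh * (2*Ww - 1) + dw, giving (2*Wh-1)*(2*Ww-1) distinct buckets.
relative_coords[:, :, 0] *= 2 * window_size[1] - 1
relative_position_index = relative_coords.sum(-1)  # Wh*Ww, Wh*Ww
```

So for a 7x7 window there are 13*13 = 169 possible relative offsets, and the index tensor selects the learned bias for each query-key pair. `attn_mask` is a separate mechanism that blocks attention across the window boundaries introduced by the cyclic shift.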