Han Hu

Results 39 comments of Han Hu

Try different learning rates and evaluate on your datasets. For classification, you can start with 1/10 of the pre-training learning rate.
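As a rough illustration of that starting point, the sketch below builds a small list of fine-tuning learning rates centred on 1/10 of the pre-training value. The function name `finetune_lr_candidates` and the extra scale factors (0.05 and 0.2) are assumptions for illustration only; the comment itself only suggests 1/10 as a first try.

```python
def finetune_lr_candidates(pretrain_lr, factors=(0.1, 0.05, 0.2)):
    """Return candidate fine-tuning learning rates.

    The first factor (0.1) is the suggested starting point; the other
    factors are hypothetical neighbours to compare on your own dataset.
    """
    if pretrain_lr <= 0:
        raise ValueError("pretrain_lr must be positive")
    return [pretrain_lr * f for f in factors]


# Example: with a pre-training learning rate of 1e-3, start from 1e-4.
candidates = finetune_lr_candidates(1e-3)
for lr in candidates:
    print(f"try lr={lr:.2e}")
```

Each candidate would then be evaluated on a held-out split of the target dataset before choosing one.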

> Hi, thank you for your great work. My image size is 112x112, the head is 12, and my window size is 7. It does not work...

Please go to Swin V2 for an approach to deal with varying window resolutions.

@scott870430 You can try bicubic interpolation to reuse the pretrained model weights with a different window size.
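A minimal sketch of what that interpolation might look like in PyTorch, assuming the relative position bias table is stored with shape `((2*W - 1)**2, num_heads)` for a square window of size `W`. The helper name `resize_relative_position_bias_table` is hypothetical, and this is an illustration rather than the exact loading code of any repository.

```python
import torch
import torch.nn.functional as F


def resize_relative_position_bias_table(table, new_window_size):
    """Bicubically resize a table of shape ((2*W_old - 1)**2, num_heads)
    to ((2*W_new - 1)**2, num_heads). Assumes a square window."""
    old_len, num_heads = table.shape
    old_side = int(round(old_len ** 0.5))
    if old_side * old_side != old_len:
        raise ValueError("table length is not a perfect square")
    new_side = 2 * new_window_size - 1
    if new_side == old_side:
        return table.clone()
    # Treat each head as one channel of a 2D map: (1, num_heads, side, side).
    grid = table.permute(1, 0).reshape(1, num_heads, old_side, old_side)
    resized = F.interpolate(
        grid, size=(new_side, new_side), mode="bicubic", align_corners=False
    )
    # Flatten back to (new_side**2, num_heads).
    return resized.reshape(num_heads, new_side * new_side).permute(1, 0).contiguous()


# Example: a table trained with window size 7 (13x13 offsets), resized for window size 12.
pretrained_table = torch.randn(13 * 13, 3)
resized_table = resize_relative_position_bias_table(pretrained_table, 12)
print(resized_table.shape)
```

The resized table is only an initialization; fine-tuning at the new window size is still expected.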

We use single-label annotations for pre-training. I have read a paper that converts the original ImageNet21K labels to multi-label ones, but I cannot remember the specific title. I would greatly appreciate if...

Please see Swin V2 for details of the de-duplication approach.

> Hi, did you solve the issue? I also encountered the situation where grad_norm.avg is nan, but the other items are OK (loss and grad.val). It does not affect training....

@jiandan42 Yes, it happens sometimes. You can try setting grad_clip, or using the native PyTorch fp16 support, or using DeepSpeed. We find the latter two mixed training frameworks are more...
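For the first two suggestions, a training step that combines gradient clipping with PyTorch's built-in mixed precision could be sketched as follows. The toy model, the `train_step` helper, and the clipping threshold `max_grad_norm=5.0` are assumptions for illustration, not values taken from this thread. Mixed precision is enabled only when CUDA is available, so the sketch also runs on CPU.

```python
import torch
import torch.nn as nn


def train_step(model, batch, target, optimizer, scaler, max_grad_norm=5.0):
    """One optimization step with autocast, loss scaling and gradient clipping."""
    device_type = "cuda" if batch.is_cuda else "cpu"
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type=device_type, enabled=scaler.is_enabled()):
        loss = nn.functional.cross_entropy(model(batch), target)
    scaler.scale(loss).backward()
    # Unscale first so the clipping threshold applies to the true gradients.
    scaler.unscale_(optimizer)
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    # scaler.step skips the optimizer update if the gradients contain inf or NaN.
    scaler.step(optimizer)
    scaler.update()
    return loss.item(), grad_norm.item()


use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"
model = nn.Linear(8, 3).to(device)  # hypothetical stand-in for a real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

x = torch.randn(4, 8, device=device)
y = torch.randint(0, 3, (4,), device=device)
loss_value, grad_norm = train_step(model, x, y, optimizer, scaler)
print(loss_value, grad_norm)
```

When a step overflows in fp16, the reported gradient norm for that step can be inf or NaN even though the scaler skips the update, which is one plausible way a logged norm can be non-finite without the weights being affected.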

This change is not crucial; it is just for clarity.