Han Hu
The operators are tested only under the 1.1.0 and 1.3.0 branches. You may try them under these branches.
If you encounter NaN, please retry until there is no NaN; some random initializations can cause divergence problems. If the problem still exists, it might be because the base lr...
> Hi!
>
> To combine the Swin transformer backbone with the Deformable DETR detector, [SOLQ](https://github.com/megvii-research/SOLQ/blob/main/models/swin_transformer.py) made some changes to `swin_transformer.py` that allow computing the padding mask dynamically and allow for...
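For reference, the usual DETR-style pattern for deriving such a padding mask dynamically is a single interpolation per backbone stage. This is a minimal sketch assuming a boolean mask (True = padded pixels) travels with the batched images; it is not a copy of SOLQ's code.

```python
import torch
import torch.nn.functional as F

def mask_for_feature_map(mask: torch.Tensor, feat: torch.Tensor) -> torch.Tensor:
    """Downsample a per-pixel padding mask (True = padded) to a feature map's
    spatial resolution. mask: (B, H, W) bool; feat: (B, C, h, w)."""
    # Interpolate the float mask to the feature resolution, then re-binarize.
    return F.interpolate(mask[None].float(), size=feat.shape[-2:]).to(torch.bool)[0]

images = torch.randn(2, 3, 224, 224)
mask = torch.zeros(2, 224, 224, dtype=torch.bool)   # True marks padded pixels
feat = torch.randn(2, 256, 28, 28)                  # one backbone stage output
feat_mask = mask_for_feature_map(mask, feat)        # (2, 28, 28) bool
```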
> Thanks for sharing! I suggest you also try the Swin V2 models with SimMIM pre-training. In our experience, SimMIM pre-training should be friendlier for low-level...
> Pre-norm only bounds the activations at the input of each block, not the output. The outputs can accumulate and grow larger and larger in deeper layers.
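A toy illustration of that accumulation, assuming a plain stack of LayerNorm + MLP residual blocks (illustrative names, not the Swin code): the residual stream itself is never normalized, so its magnitude grows with depth.

```python
import torch
import torch.nn as nn

dim, depth = 64, 12
blocks = nn.ModuleList(
    nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
    for _ in range(depth)
)
norms = nn.ModuleList(nn.LayerNorm(dim) for _ in range(depth))

x = torch.randn(8, dim)
for i, (norm, block) in enumerate(zip(norms, blocks)):
    # Pre-norm: only the branch *input* is normalized; the sum is not.
    x = x + block(norm(x))
    print(f"layer {i}: mean activation L2 norm = {x.norm(dim=-1).mean():.2f}")
```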
> `torch.cat((self.q_bias, torch.zeros_like(self.v_bias, requires_grad=False), self.v_bias))` It is equivalent to the algorithm with a k bias but simpler. You can derive it yourself; it is very simple.
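To spell out the derivation and where that concatenation sits: softmax is invariant to adding a constant to every logit in a row, and a key bias b_k shifts row i of q @ kᵀ by the constant q_i · b_k, so the k bias can be fixed to zero without changing the attention output. A minimal sketch of the projection (relative position bias and other Swin details omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attention(nn.Module):
    """Sketch: learnable q/v biases, fixed zero k bias, as in Swin V2."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.q_bias = nn.Parameter(torch.zeros(dim))
        self.v_bias = nn.Parameter(torch.zeros(dim))
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape
        # Zero k bias concatenated between the learnable q and v biases.
        qkv_bias = torch.cat(
            (self.q_bias, torch.zeros_like(self.v_bias, requires_grad=False), self.v_bias)
        )
        qkv = F.linear(x, self.qkv.weight, qkv_bias)
        q, k, v = qkv.reshape(B, N, 3, self.num_heads, -1).permute(2, 0, 3, 1, 4)
        attn = (q * (C // self.num_heads) ** -0.5) @ k.transpose(-2, -1)
        attn = attn.softmax(dim=-1)
        return self.proj((attn @ v).transpose(1, 2).reshape(B, N, C))
```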
> No, it will not get better accuracy. But if you use SimMIM pre-training, Swin V2-L will perform better than Swin V2-B. Please try https://github.com/microsoft/Swin-Transformer/blob/main/get_started.md#simmim-support
> Swin V1 uses pre-norm layers. Swin V2 uses a new normalization configuration named res-post-norm. Please look into https://arxiv.org/pdf/2111.09883.pdf for details.
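A minimal sketch of the two arrangements, with `f` standing in for an attention or MLP branch; only the placement of the LayerNorm differs, but res-post-norm normalizes each branch output before it is added back, keeping the residual stream bounded:

```python
import torch.nn as nn

class PreNormBlock(nn.Module):
    """Swin V1 style: x = x + f(LN(x)); the residual sum is never normalized."""
    def __init__(self, dim: int, f: nn.Module):
        super().__init__()
        self.norm, self.f = nn.LayerNorm(dim), f

    def forward(self, x):
        return x + self.f(self.norm(x))

class ResPostNormBlock(nn.Module):
    """Swin V2 style: x = x + LN(f(x)); each branch output is normalized."""
    def __init__(self, dim: int, f: nn.Module):
        super().__init__()
        self.norm, self.f = nn.LayerNorm(dim), f

    def forward(self, x):
        return x + self.norm(self.f(x))
```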
> I am trying to train an image classifier where the image ground truth contains multiple classes. Is it possible to train a model that outputs multiple classes?

Yes, it should...
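A minimal sketch of how such multi-label training usually looks: swap the single-label cross-entropy for per-class sigmoids with `BCEWithLogitsLoss` and multi-hot targets. The toy linear model here is only a stand-in for a real backbone such as a Swin classifier.

```python
import torch
import torch.nn as nn

num_classes = 10
# Stand-in model; any network producing (B, num_classes) logits works.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, num_classes))

images = torch.randn(4, 3, 224, 224)
# Multi-hot targets: each image may belong to several classes at once.
targets = torch.zeros(4, num_classes)
targets[0, [1, 3]] = 1.0
targets[1, 7] = 1.0

logits = model(images)
loss = nn.BCEWithLogitsLoss()(logits, targets)  # one sigmoid per class
loss.backward()

# At inference, threshold the per-class probabilities independently.
predicted = torch.sigmoid(logits) > 0.5
```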