ydhong_hit

19 comments by ydhong_hit

I estimate that the variance of some SOTA methods, such as Segmenter and BEiT, is more than 1% mIoU on ADE20K.

> Hi, can you provide the code of the grounding transformer consistent with the paper, including both Feud and mixsoft — not a standalone implementation of Feud or mixsoft but the intact...

> Thanks for your interest. Initially, we did not release the configs and pre-trained weights for Cityscapes because the Cityscapes video data is extremely large (>300 GB). Nevertheless, we will...

> Hi ydhong. Have you tried training lawin-B2 for 160K iterations? The performance reported in Table 1 is obtained with a 160K training schedule. Thanks for your reply. According to...

> Yes, it should be 512. Also, I recommend switching the proj_type in PatchEmbed from 'pool' to 'conv', which uses a group conv in place of the mix pooling at...

During the inference stage, I find that your model requires the input resolution to be a multiple of 64. For ADE20K, I use the 'ResizeToMultiple' transform in mmseg to achieve this....
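For illustration, rounding an input up to a multiple of 64 can be sketched as below. This is my own minimal padding-based version, not mmseg's `ResizeToMultiple` (which resizes rather than pads); the function name and zero-padding strategy are assumptions.

```python
import numpy as np

def pad_to_multiple(img: np.ndarray, divisor: int = 64) -> np.ndarray:
    """Zero-pad an (H, W, ...) image so H and W become multiples of `divisor`."""
    h, w = img.shape[:2]
    new_h = -(-h // divisor) * divisor  # ceil(h / divisor) * divisor
    new_w = -(-w // divisor) * divisor
    # Pad only on the bottom/right; leave any extra channel dims untouched.
    pad_widths = ((0, new_h - h), (0, new_w - w)) + ((0, 0),) * (img.ndim - 2)
    return np.pad(img, pad_widths)
```

With mmseg itself, the same effect comes from adding `dict(type='ResizeToMultiple', size_divisor=64)` to the test pipeline.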

> Sorry for the late reply. Honestly, we have to delay the full-code release plan because lawin has not yet been accepted by any conference or journal, and...

The resize_pos_embed function is not used [here](https://github.com/facebookresearch/dinov2/blob/main/dinov2/eval/segmentation_m2f/models/backbones/vit.py). What should be done when the pre-training resolution is inconsistent with the downstream input resolution?
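The usual remedy is to interpolate the patch positional embeddings to the new grid size. Below is a hedged NumPy sketch of that idea using nearest-neighbor resampling for simplicity; real ViT implementations (DINOv2 included) interpolate with bicubic mode, and the function and argument names here are my own.

```python
import numpy as np

def resize_pos_embed(pos_embed: np.ndarray,
                     old_grid: tuple[int, int],
                     new_grid: tuple[int, int]) -> np.ndarray:
    """Resample ViT positional embeddings to a new patch grid.

    pos_embed: (1, 1 + old_h * old_w, C), with a leading class token.
    """
    cls_tok, patch_tok = pos_embed[:, :1], pos_embed[:, 1:]
    old_h, old_w = old_grid
    new_h, new_w = new_grid
    c = patch_tok.shape[-1]
    grid = patch_tok.reshape(old_h, old_w, c)
    # Nearest-neighbor index maps (a real implementation would use bicubic).
    ys = (np.arange(new_h) * old_h / new_h).astype(int)
    xs = (np.arange(new_w) * old_w / new_w).astype(int)
    resized = grid[ys][:, xs].reshape(1, new_h * new_w, c)
    # Class-token embedding is position-free, so it is kept as-is.
    return np.concatenate([cls_tok, resized], axis=1)
```

If the checkpoint and the model agree on grid size, this is a no-op; otherwise it stretches the learned 2-D embedding grid to cover the new resolution.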

> Could you provide the training config file of EVA-02-L on cocostuff164k that achieves 53.7 mIoU? I found the training config in the appendix, but you do not use the...