mmsegmentation
[Feature] Add HRViT (CVPR'2022)
Motivation
Add HRViT (resolves #1730), described in "Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation". HRViT is a vision transformer backbone designed for semantic segmentation. It uses a multi-branch high-resolution (HR) architecture with enhanced multi-scale representability. According to the paper, it surpasses the state-of-the-art MiT and CSWin backbones on ADE20K and Cityscapes with an average improvement of +1.78 mIoU, 28% fewer parameters, and 21% fewer FLOPs.
Modification
This PR adds the new HRViT backbone and sample config files for it.
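For readers unfamiliar with mmsegmentation configs, the sketch below shows roughly how a backbone like this is wired into a model config. It is only an illustration and not the contents of the config files in this PR: the registry name `HRViT`, the pairing with `SegformerHead`, and the `in_channels` values are all assumptions, and the backbone's own arguments are omitted because they depend on the implementation.

```python
# Hypothetical sketch of an mmsegmentation model config using the new backbone.
# Values marked "assumed" or "placeholder" are illustrative and not taken from this PR.
norm_cfg = dict(type='SyncBN', requires_grad=True)

model = dict(
    type='EncoderDecoder',
    backbone=dict(
        type='HRViT',  # assumed registry name for the new backbone
    ),
    decode_head=dict(
        type='SegformerHead',            # assumed choice of decode head
        in_channels=[32, 64, 128, 256],  # placeholder: must match the backbone's output channels
        in_index=[0, 1, 2, 3],           # use all four backbone output branches
        channels=256,
        dropout_ratio=0.1,
        num_classes=150,                 # ADE20K has 150 classes
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
```

A config like this would normally be combined with dataset, schedule, and runtime settings through `_base_` files before training.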
Checklist
- Pre-commit or other linting tools are used to fix the potential lint issues. ✅
- The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. ❌ - unit tests still need to be added, and I might need some help with them
- If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D. ❗ - not sure if this applies, but this backbone could potentially be used in MMDet as well
- The documentation has been modified accordingly, like docstring or example tutorials. ❌ - documentation and docstrings still need to be added
Notes about LICENSE:
The implementation is mostly borrowed from the original repository, with slight modifications to make it compatible with mmsegmentation. The copyright notices in the files have been kept.
Hi @lorinczszabolcs, thanks for your nice PR! Since the official repo has not released the pretrained weights (facebookresearch/HRViT#3), we have decided after discussion to put this PR on hold for now. We will keep following the HRViT work, and if pretrained weights are released, we will move this PR forward. If you find that the HRViT authors have released pretrained weights, you are also welcome to leave a message on this PR. And if you don't mind, you could follow the contribution guide to fix the lint error in the meantime.
Hi @xiexinch !
OK, that's understandable. Hopefully they will release the pretrained weights soon.
I followed the contribution guide and used the pre-commit hooks as well, but I haven't added docstrings yet in case the code itself changes, which is why linting failed. If the PR goes ahead once pretrained weights are released, and the code is reviewed and confirmed to work, I will add the docstrings as well. Thanks for the feedback!