Hardness-Level-Learning
Not All Pixels Are Equal: Learning Pixel Hardness for Semantic Segmentation
Paper
Official PyTorch Implementation
Xin Xiao, Daiguo Zhou, Jiagao Hu, Yi Hu and Yongchao Xu
Abstract: Semantic segmentation has recently witnessed great progress. Despite the impressive overall results, the segmentation performance in some hard areas (e.g., small objects or thin parts) is still not promising. A straightforward solution is hard sample mining, which is widely used in object detection. Yet, most existing hard pixel mining strategies for semantic segmentation rely on the pixel's loss value, which tends to decrease during training. Intuitively, the pixel hardness for segmentation mainly depends on image structure and is expected to be stable. In this paper, we propose to learn pixel hardness for semantic segmentation, leveraging hardness information contained in global and historical loss values. More precisely, we add a gradient-independent branch that learns a hardness level (HL) map by maximizing the hardness-weighted segmentation loss, which is minimized for the segmentation head. This encourages large hardness values in difficult areas, leading to an appropriate and stable HL map. Despite its simplicity, the proposed method can be applied to most segmentation methods with no extra cost during inference and only marginal extra cost during training. Without bells and whistles, it achieves consistent and significant improvement (1.37% mIoU on average) over the most popular semantic segmentation methods on the Cityscapes dataset, and demonstrates good generalization ability across domains.
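For intuition, here is a minimal sketch of the hardness-weighted loss idea as we read it from the abstract. It is not the exact code used in this repository, and all function and variable names are illustrative.

```python
# Minimal sketch (illustrative, not the repository's implementation):
# the segmentation head minimizes a hardness-weighted cross-entropy,
# while the HL branch maximizes the same weighted loss, so large hardness
# values are pushed toward pixels that remain difficult.
import torch
import torch.nn.functional as F

def hardness_weighted_losses(seg_logits, hl_map, target, ignore_index=255):
    # seg_logits: (N, C, H, W) predictions from the segmentation head
    # hl_map:     (N, 1, H, W) hardness levels from the HL branch (e.g., after a sigmoid)
    # target:     (N, H, W) ground-truth labels
    ce = F.cross_entropy(seg_logits, target, ignore_index=ignore_index, reduction="none")
    valid = (target != ignore_index).float()
    weight = hl_map.squeeze(1)
    denom = valid.sum().clamp(min=1.0)

    # Segmentation head: minimize the weighted loss; the HL map is detached so
    # segmentation gradients do not flow into the HL branch.
    seg_loss = (weight.detach() * ce * valid).sum() / denom

    # HL branch: maximize the weighted loss (minimize its negation); the per-pixel
    # loss is detached so HL gradients do not flow back into the segmentation head.
    hl_loss = -(weight * ce.detach() * valid).sum() / denom
    return seg_loss, hl_loss
```

In this reading, the two detach() calls are what keep the HL branch gradient-independent of the segmentation head; the actual loss form and branch architecture follow the paper and the configs in this repository.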
Usage
To reproduce the results in the paper, we recommend following the instructions below. Other versions of PyTorch and mmcv are untested but may work.
Requirements
- PyTorch == 1.8.2
- mmcv-full == 1.4.5
Getting started
1. Install dependencies
Step 1: Create a conda environment and activate it.
conda create -n HardnessLevel python=3.7
conda activate HardnessLevel
Step 2: Install PyTorch and torchvision
pip3 install torch==1.8.2 torchvision==0.9.2 --extra-index-url https://download.pytorch.org/whl/lts/1.8/cu111
Step 3: Install mmcv-full
pip install -U openmim
mim install mmcv-full==1.4.5
pip3 install matplotlib numpy packaging prettytable cityscapesscripts
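As an optional sanity check (our suggestion, not part of the original instructions), verify that the intended versions are picked up:

```bash
python -c "import torch, mmcv; print(torch.__version__, mmcv.__version__)"
# should report torch 1.8.2 and mmcv 1.4.5
```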
2. Data preparation
cd mmsegmentation
mkdir data
Please follow the instructions of mmsegmentation for data preparation.
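For Cityscapes, the layout under data/ follows the standard mmsegmentation convention; the sketch below is for orientation only, and the exact steps (including label conversion) are in the mmsegmentation dataset guide.

```
mmsegmentation
└── data
    └── cityscapes
        ├── leftImg8bit
        │   ├── train
        │   └── val
        └── gtFine
            ├── train
            └── val
```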
3. Training
For instance, train PSPNet-ResNet101 with HL on Cityscapes using 4 GPUs:
bash ./tools/dist_train.sh configs/pspnet_hl/pspnet_r101-d8_769x769_40k_cityscapes_hl.py 4
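If only a single GPU is available, mmsegmentation's non-distributed entry point should work with the same config (we have not verified this with the HL configs, and the learning rate may need adjusting for a different GPU count):

```bash
python ./tools/train.py configs/pspnet_hl/pspnet_r101-d8_769x769_40k_cityscapes_hl.py
```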
4. Testing
For instance, test PSPNet-ResNet101 with HL on Cityscapes using 4 GPUs:
bash ./tools/dist_test.sh configs/pspnet_hl/pspnet_r101-d8_769x769_40k_cityscapes_hl.py /path/pspnet.pth 4 --eval mIoU
Pretrained model: Baidu Netdisk (code: jahh) / Google Drive.
Note that you should replace /path/pspnet.pth with the path where you store the checkpoint file. You should obtain 80.65 mIoU on the val set.
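To additionally dump qualitative predictions, mmsegmentation's test script accepts a --show-dir argument; the output directory below is only an example and has not been verified with the HL configs:

```bash
bash ./tools/dist_test.sh configs/pspnet_hl/pspnet_r101-d8_769x769_40k_cityscapes_hl.py /path/pspnet.pth 4 --eval mIoU --show-dir ./vis_results
```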
Results
Training logs can be found here. Experiments are conducted on a machine with 8 A100-40GB GPUs. Using the HL map collected from PSPNet, we achieve consistent improvement over Mask2Former, the new paradigm of semantic segmentation.
Extensions
1. Domain-generalized semantic segmentation
Evaluate GTAV -> Cityscapes domain generalization performance by:
bash ./tools/dist_test.sh configs/gta_hl/deeplab_gta2city_res101_hl.py /path/gta_hl.pth 4 --eval mIoU
Pretrained model: Baidu Netdisk (code: ujra) / Google Drive.
Note that you should replace /path/gta_hl.pth with the path where you store the checkpoint file. You should obtain 43.06 mIoU on the val set.
Please refer to DAFormer for more details.
2. Semi-supervised semantic segmentation
Cityscapes: results are obtained with DeepLabv3+ using a ResNet-101 backbone.
| ResNet-101 | 1/16 | 1/8 | 1/4 | 1/2 |
| --- | --- | --- | --- | --- |
| SupOnly | 65.7 | 72.5 | 74.4 | 77.8 |
| U2PL (paper) | 70.3 | 74.4 | 76.5 | 79.1 |
| U2PL (reproduced) | 71.1 | 75.2 | 75.9 | 78.4 |
| U2PL + HL | 72.6 | 76.0 | 76.6 | 79.6 |
| UniMatch (paper) | 75.7 | 77.3 | 78.7 | _ |
| UniMatch + HL | 76.2 | 78.2 | 78.9 | _ |
Note: the UniMatch results are obtained with the original version of UniMatch, not the CVPR 2023 version.
License
This project is released under the Apache 2.0 license.
Acknowledgment
This code is built on the mmsegmentation repository. Thanks a lot for their great work!
Citation
@misc{xiao2023pixels,
  title={Not All Pixels Are Equal: Learning Pixel Hardness for Semantic Segmentation},
  author={Xin Xiao and Daiguo Zhou and Jiagao Hu and Yi Hu and Yongchao Xu},
  year={2023},
  eprint={2305.08462},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}