mmsegmentation
[Features] Add support for Kitti semantic segmentation dataset
Thanks for your contribution and we appreciate it a lot. The following instructions will make your pull request healthier and make it easier to get feedback. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.
Motivation
Please describe the motivation of this PR and the goal you want to achieve through this PR.
The KITTI semantic segmentation dataset is a lightweight dataset for semantic segmentation that shares the same label policy as Cityscapes. It is an excellent starting point for segmentation and can employ weights pre-trained on Cityscapes for transfer learning. Would you consider supporting this dataset?
http://www.cvlibs.net/datasets/kitti/eval_semseg.php?benchmark=semantics2015
Modification
Please briefly describe what modification is made in this PR.
- add a conversion tool to normalize the dataset format
- add a customized dataset class in mmseg
- add config for the KITTI dataset
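For illustration, the customized dataset class would follow mmseg's registry pattern. The sketch below is self-contained (the registry decorator is stubbed in place of mmseg's `DATASETS.register_module()`, and the class name and suffixes are assumptions, not necessarily what this PR uses); it shows the core idea that KITTI reuses the Cityscapes class definitions:

```python
# Self-contained sketch of the "customized dataset class" idea. In the
# real PR this would subclass mmseg's CityscapesDataset and use its
# DATASETS.register_module() decorator; both are stubbed here so the
# sketch runs on its own.
DATASETS = {}

def register_module(cls):
    """Stand-in for mmseg's registry decorator."""
    DATASETS[cls.__name__] = cls
    return cls

@register_module
class KITTIDataset:
    # KITTI shares the Cityscapes label policy, so the 19 evaluated
    # class names (and palette) can be reused directly.
    CLASSES = ('road', 'sidewalk', 'building', 'wall', 'fence', 'pole',
               'traffic light', 'traffic sign', 'vegetation', 'terrain',
               'sky', 'person', 'rider', 'car', 'truck', 'bus', 'train',
               'motorcycle', 'bicycle')

    def __init__(self, img_suffix='.png', seg_map_suffix='.png'):
        self.img_suffix = img_suffix
        self.seg_map_suffix = seg_map_suffix
```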
BC-breaking (Optional)
Does the modification introduce changes that break the backward-compatibility of the downstream repos? If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
NO BC
Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.
Checklist
- Pre-commit or other linting tools are used to fix the potential lint issues.
- The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
- If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
- The documentation has been modified accordingly, like docstring or example tutorials.
Per the discussion in https://github.com/open-mmlab/mmsegmentation/issues/1599
Hi, @AkideLiu thanks for your nice PR. We would review it asap.
Please fix the lint error.
Codecov Report
Merging #1602 (1716a41) into master (0e37281) will decrease coverage by 0.00%. The diff coverage is 85.71%.
@@ Coverage Diff @@
## master #1602 +/- ##
==========================================
- Coverage 89.04% 89.04% -0.01%
==========================================
Files 144 145 +1
Lines 8636 8643 +7
Branches 1458 1459 +1
==========================================
+ Hits 7690 7696 +6
- Misses 706 707 +1
Partials 240 240
Flag | Coverage Δ
---|---
unittests | 89.04% <85.71%> (-0.01%) :arrow_down:
Flags with carried forward coverage won't be shown.
Impacted Files | Coverage Δ
---|---
mmseg/datasets/kitti.py | 83.33% <83.33%> (ø)
mmseg/datasets/__init__.py | 100.00% <100.00%> (ø)
Continue to review the full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 336435b...1716a41.
Do you have related baseline or sota results on KITTI semantic segmentation dataset?
Hi @MengzhangLI, I do not have baseline or SOTA results on this dataset, because the methods used in the relevant publications are not implemented in mmseg.
However, this dataset can be evaluated directly with pre-trained models or trained from scratch in mmseg, and I have successfully trained and evaluated UNet on it.
I could provide some example training configurations, but I do not have the resources to perform distributed training to obtain a pre-trained model.
For more baseline info, see: https://paperswithcode.com/sota/semantic-segmentation-on-kitti-semantic
OK, would you mind updating your training results (using mmseg) and attaching results from other repos/papers in this PR? We could polish up this PR together by training some semantic segmentation models on our side (using our own 4x or 8x V100 GPUs).
Hi @MengzhangLI, I am planning to run around 1000 epochs on a single GPU for three well-known networks, UNet, DeepLabV3+ and PSPNet, as a baseline for this dataset. I will update my local test results once the experiments are finalized.
In this stage, I would append the UNet 1000 epochs results as follows:
+---------------+-------+-------+
| Class | IoU | Acc |
+---------------+-------+-------+
| road | 90.04 | 96.1 |
| sidewalk | 52.39 | 59.23 |
| building | 69.12 | 87.9 |
| wall | 33.87 | 42.57 |
| fence | 34.65 | 43.59 |
| pole | 51.69 | 64.28 |
| traffic light | 63.4 | 70.51 |
| traffic sign | 42.8 | 45.97 |
| vegetation | 89.82 | 94.94 |
| terrain | 78.42 | 91.19 |
| sky | 95.64 | 98.02 |
| person | 7.94 | 9.48 |
| rider | 0.0 | 0.0 |
| car | 85.54 | 94.55 |
| truck | 9.93 | 28.45 |
| bus | 57.31 | 85.58 |
| train | 0.0 | 0.0 |
| motorcycle | 0.0 | 0.0 |
| bicycle | 2.94 | 3.29 |
+---------------+-------+-------+
+-------+-------+-------+
| aAcc | mIoU | mAcc |
+-------+-------+-------+
| 90.25 | 45.55 | 53.46 |
+-------+-------+-------+
I will fix lint and update configs soon.
Could you please provide an email address where I could send the training logs?
Additionally, will you help modify the training configs for multi-GPU setups?
Thanks. Training logs can be attached directly in your reply.
And I could train some models like DeepLabV3Plus, PSPNet and Swin Transformer models using MMSegmentation default settings. Their results should be better than the single-GPU UNet results; let's keep in touch.
Best,
Looking forward to the training and evaluation results for different network architectures on your end.
The log of the previous UNet has been attached below.
Could you please help to fix the lint issues?
I do not quite understand why the lint check failed...
isort....................................................................Failed
- hook id: isort
- files were modified by this hook
Fixing /home/runner/work/mmsegmentation/mmsegmentation/mmseg/datasets/__init__.py
yapf.....................................................................Failed
- hook id: yapf
- files were modified by this hook
Trim Trailing Whitespace.................................................Passed
Check Yaml...............................................................Passed
Fix End of Files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook
Fixing docs/en/dataset_prepare.md
This seems to be caused by an unsuccessful installation of pre-commit. The error you showed is usually caused by local code that does not obey the coding rules defined by pre-commit.
Try to follow: https://github.com/open-mmlab/mmsegmentation/blob/master/.github/CONTRIBUTING.md
If your OS is Ubuntu/Linux, the installation should be easy.
After a successful installation, run the `pre-commit run --all-files` command, then use `git add .` to stage the fixed files.
@xiexinch a new commit responds to the review; if you have any suggestions, please let me know.
Hi @AkideLiu, I'm searching for baselines on the KITTI dataset, since we'll do some experiments on it. If you have any suggestions, please feel free to let me know.
I am also working on this dataset to find an optimal solution; one suggestion is to use transfer learning with weights pre-trained on Cityscapes.
Hi @AkideLiu, we need some published results as the baseline for this dataset. If you find relevant published papers, please do not hesitate to contact us.
Hi @MengzhangLI @AkideLiu How about using these results as the baseline? Ref MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020)
Hi @AkideLiu, I'd like to ask how your training is running. Doesn't the annotation of the dataset need to be converted to trainLabelIds first?
Hi @xiexinch, conversion of the label ids is not required for this dataset. The official format can easily be adapted from the Cityscapes label policy, and the rest of the dataset configuration is identical to Cityscapes. For a quick start: download the zipped data from the official website linked in the PR description, unzip it, run the directory-structure conversion script provided in this PR, and modify the local directory paths to match the configuration files. After that, you are free to start training.
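For a rough idea of what such a directory-structure conversion could look like, here is a hedged sketch (not the PR's actual script): it assumes the official KITTI layout (`training/image_2` and `training/semantic`), the mmseg-style `img_dir`/`ann_dir` target folders, and a simple deterministic 80/20 split, all of which are assumptions for illustration.

```python
# Sketch of a KITTI directory-structure conversion: copy images and
# annotations into train/val subfolders in an mmseg-style layout.
# Folder names and the split ratio are illustrative assumptions.
import shutil
from pathlib import Path

def split_kitti(src_dir, dst_dir, val_ratio=0.2):
    """Copy KITTI images/annotations into train/val subfolders."""
    src, dst = Path(src_dir), Path(dst_dir)
    # KITTI semantic images and labels share file names.
    images = sorted((src / 'training' / 'image_2').glob('*.png'))
    n_val = int(len(images) * val_ratio)
    for i, img in enumerate(images):
        subset = 'val' if i < n_val else 'train'
        for kind, folder in (('image_2', 'img_dir'),
                             ('semantic', 'ann_dir')):
            out = dst / folder / subset
            out.mkdir(parents=True, exist_ok=True)
            shutil.copy(src / 'training' / kind / img.name,
                        out / img.name)
```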
Regarding MSeg: I have briefly gone through this paper; it is quite a good baseline, as it has a distinct performance report and the implementation is open-sourced for reference.
I know what you mean, but your conversion script just splits the dataset. I tried to start training, and the official annotations cannot be used directly for training; they only work after I convert the label id to the train id.
I do not fully understand this problem; would you explain more about this case?
Image annotations provided by Cityscapes and KITTI use label ids. For training, we must convert the label id to the train id. You can read the code from cityscapesscripts:
https://github.com/mcordts/cityscapesScripts/blob/aeb7b82531f86185ce287705be28f452ba3ddbb8/cityscapesscripts/helpers/labels.py#L64
I'd like to know how your training ran if you didn't do this conversion?
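For reference, here is a hedged sketch of such a conversion (not this PR's implementation): it remaps labelIds to the 19 Cityscapes trainIds with a lookup table, following the mapping defined in the cityscapesscripts labels.py linked above; unlisted ids (void classes, license plate) map to the ignore index 255.

```python
# Sketch of a labelId -> trainId remapping for KITTI/Cityscapes-style
# annotations, per the mapping in cityscapesscripts helpers/labels.py.
import numpy as np

# labelId -> trainId for the 19 evaluated classes (road=0 ... bicycle=18).
LABEL_TO_TRAIN = {
    7: 0,   8: 1,  11: 2,  12: 3,  13: 4,  17: 5,  19: 6,
    20: 7,  21: 8, 22: 9,  23: 10, 24: 11, 25: 12, 26: 13,
    27: 14, 28: 15, 31: 16, 32: 17, 33: 18,
}

def label_to_train_ids(seg_map):
    """Remap a labelId annotation array to trainIds (255 = ignore)."""
    lut = np.full(256, 255, dtype=np.uint8)  # default: ignore index
    for label_id, train_id in LABEL_TO_TRAIN.items():
        lut[label_id] = train_id
    return lut[seg_map]
```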
Hi @xiexinch, I will try to reproduce the training in a fresh environment and provide an update soon.
Apologies for the delay in progress; previously I was taking part in a competition highly similar to this dataset (a subset). Here are some experiments based on mmseg: https://github.com/UAws/CV-3315-Is-All-You-Need. At this stage, we have completed the competition and will focus on this PR to resolve the unfinished parts.
@xiexinch a solution for converting labels has been provided; could you review this PR?
Reference: https://github.com/navganti/kitti_scripts/blob/master/semantics/devkit/kitti_relabel.py
Thanks for updating this PR, we'll review it asap. :)