
[Features] Add support for Kitti semantic segmentation dataset

Open AkideLiu opened this issue 2 years ago • 26 comments

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and make it easier to get feedback. If you do not understand some items, don't worry; just make the pull request and seek help from the maintainers.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

The KITTI semantic segmentation dataset is a lightweight dataset for semantic segmentation that shares the same label policy as Cityscapes. It is an excellent starting point for segmentation, and weights pre-trained on Cityscapes can be used for transfer learning. Would you consider supporting this dataset?

http://www.cvlibs.net/datasets/kitti/eval_semseg.php?benchmark=semantics2015

Modification

Please briefly describe what modification is made in this PR.

  1. Add a conversion tool that restructures the dataset into the expected format
  2. Add a customized dataset class in mmseg
  3. Add configs for the KITTI dataset
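As a rough sketch of item 2 (an assumption about the final shape, not the PR's actual code): since KITTI semantics shares the Cityscapes label policy, the dataset class would mostly just override file suffixes. CityscapesDataset is stubbed here so the sketch is self-contained; the suffix values are assumptions.

```python
# Hypothetical sketch of the new dataset class; in the actual PR it would
# subclass mmseg.datasets.CityscapesDataset and be registered via @DATASETS.
class CityscapesDataset:  # stand-in for mmseg.datasets.CityscapesDataset
    def __init__(self, img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelTrainIds.png', **kwargs):
        self.img_suffix = img_suffix
        self.seg_map_suffix = seg_map_suffix


class KITTIDataset(CityscapesDataset):
    """KITTI semantics reuses the Cityscapes classes and palette, so only
    the file suffixes differ (the suffixes below are assumptions)."""

    def __init__(self, **kwargs):
        super().__init__(img_suffix='.png',
                         seg_map_suffix='_labelTrainIds.png', **kwargs)
```

Because the class body changes nothing else, all Cityscapes-based training pipelines and evaluation metrics would apply unchanged.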

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repos? If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

No BC-breaking changes.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

AkideLiu avatar May 20 '22 09:05 AkideLiu

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar May 20 '22 09:05 CLAassistant

Per the discussion in https://github.com/open-mmlab/mmsegmentation/issues/1599

AkideLiu avatar May 20 '22 09:05 AkideLiu

Hi @AkideLiu, thanks for your nice PR. We will review it ASAP.

Please fix the lint error.

MengzhangLI avatar May 20 '22 10:05 MengzhangLI

Codecov Report

Merging #1602 (1716a41) into master (0e37281) will decrease coverage by 0.00%. The diff coverage is 85.71%.

@@            Coverage Diff             @@
##           master    #1602      +/-   ##
==========================================
- Coverage   89.04%   89.04%   -0.01%     
==========================================
  Files         144      145       +1     
  Lines        8636     8643       +7     
  Branches     1458     1459       +1     
==========================================
+ Hits         7690     7696       +6     
- Misses        706      707       +1     
  Partials      240      240              
| Flag | Coverage Δ |
|------|------------|
| unittests | 89.04% <85.71%> (-0.01%) ↓ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Impacted Files | Coverage Δ |
|----------------|------------|
| mmseg/datasets/kitti.py | 83.33% <83.33%> (ø) |
| mmseg/datasets/__init__.py | 100.00% <100.00%> (ø) |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update 336435b...1716a41.

codecov[bot] avatar May 20 '22 10:05 codecov[bot]

Do you have related baseline or sota results on KITTI semantic segmentation dataset?

MengzhangLI avatar May 20 '22 10:05 MengzhangLI

Do you have related baseline or sota results on KITTI semantic segmentation dataset?

Hi @MengzhangLI, I do not have baseline or SOTA results on this dataset, because the methods used in the relevant publications are not implemented in mmseg.

However, this dataset can be evaluated directly with pre-trained models or trained from scratch in mmseg, and I have successfully trained and evaluated UNet on it.

I could provide some example training configurations, but I do not have the resources to perform distributed training to obtain a pre-trained model.

For more baseline info, see: https://paperswithcode.com/sota/semantic-segmentation-on-kitti-semantic

AkideLiu avatar May 20 '22 11:05 AkideLiu

Do you have related baseline or sota results on KITTI semantic segmentation dataset?

Hi @MengzhangLI, I do not have baseline or SOTA results on this dataset, because the methods used in the relevant publications are not implemented in mmseg.

However, this dataset can be evaluated directly with pre-trained models or trained from scratch in mmseg, and I have successfully trained and evaluated UNet on it.

I could provide some example training configurations, but I do not have the resources to perform distributed training to obtain a pre-trained model.

For more baseline info, see: https://paperswithcode.com/sota/semantic-segmentation-on-kitti-semantic

OK, would you mind updating your training results (using mmseg) and attaching results from other repos/papers in this PR? We could polish up this PR together by training some semantic segmentation models on our side (using our own 4x or 8x V100 GPUs).

MengzhangLI avatar May 20 '22 14:05 MengzhangLI

Hi @MengzhangLI, I am planning to run around 1000 epochs on a single GPU for three well-known networks, UNet, DeepLabV3+ and PSPNet, as a baseline for this dataset. I will update my local test results once the experiment is finalized.

At this stage, I append the UNet 1000-epoch results as follows:

+---------------+-------+-------+
|     Class     |  IoU  |  Acc  |
+---------------+-------+-------+
|      road     | 90.04 |  96.1 |
|    sidewalk   | 52.39 | 59.23 |
|    building   | 69.12 |  87.9 |
|      wall     | 33.87 | 42.57 |
|     fence     | 34.65 | 43.59 |
|      pole     | 51.69 | 64.28 |
| traffic light |  63.4 | 70.51 |
|  traffic sign |  42.8 | 45.97 |
|   vegetation  | 89.82 | 94.94 |
|    terrain    | 78.42 | 91.19 |
|      sky      | 95.64 | 98.02 |
|     person    |  7.94 |  9.48 |
|     rider     |  0.0  |  0.0  |
|      car      | 85.54 | 94.55 |
|     truck     |  9.93 | 28.45 |
|      bus      | 57.31 | 85.58 |
|     train     |  0.0  |  0.0  |
|   motorcycle  |  0.0  |  0.0  |
|    bicycle    |  2.94 |  3.29 |
+---------------+-------+-------+

+-------+-------+-------+
|  aAcc |  mIoU |  mAcc |
+-------+-------+-------+
| 90.25 | 45.55 | 53.46 |
+-------+-------+-------+

I will fix lint and update configs soon.

Could you please provide an email address where I could send the training logs?

Additionally, will you help modify the training config for multi-GPU setups?

AkideLiu avatar May 20 '22 14:05 AkideLiu

Thanks. Training logs can be dropped directly into your reply, like this:

[image]

And I could train some models like DeepLabV3Plus, PSPNet and Swin Transformer models using MMSegmentation default settings. Their results should be better than yours (single-GPU UNet); let's keep in touch.

Best,

MengzhangLI avatar May 20 '22 15:05 MengzhangLI

Looking forward to the training and evaluation results for different network architectures on your end.

The log of the previous UNet has been attached below.

Could you please help to fix the lint issues?

57d9719b0dd1756f994bada9889f2149.txt

AkideLiu avatar May 21 '22 06:05 AkideLiu

I do not quite understand why the lint failed...

isort....................................................................Failed
- hook id: isort
- files were modified by this hook

Fixing /home/runner/work/mmsegmentation/mmsegmentation/mmseg/datasets/__init__.py

yapf.....................................................................Failed
- hook id: yapf
- files were modified by this hook
Trim Trailing Whitespace.................................................Passed
Check Yaml...............................................................Passed
Fix End of Files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing docs/en/dataset_prepare.md

AkideLiu avatar May 21 '22 06:05 AkideLiu

I do not quite understand why the lint failed...

isort....................................................................Failed
- hook id: isort
- files were modified by this hook

Fixing /home/runner/work/mmsegmentation/mmsegmentation/mmseg/datasets/__init__.py

yapf.....................................................................Failed
- hook id: yapf
- files were modified by this hook
Trim Trailing Whitespace.................................................Passed
Check Yaml...............................................................Passed
Fix End of Files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing docs/en/dataset_prepare.md

It seems to be caused by an unsuccessful pre-commit installation. The error you showed is usually caused by local code that does not follow the coding rules defined by pre-commit.

Try to follow: https://github.com/open-mmlab/mmsegmentation/blob/master/.github/CONTRIBUTING.md

If your OS is Ubuntu/Linux, the installation would be easy.

After successful installation, run the `pre-commit run --all-files` command, then `git add .` to stage the fixed files.

MengzhangLI avatar May 21 '22 14:05 MengzhangLI

@xiexinch a new commit responds to the review; if you have any suggestions, please let me know.

AkideLiu avatar Jun 06 '22 10:06 AkideLiu

Hi @AkideLiu, I'm looking for baselines on the KITTI dataset, since we'll do some experiments on it. If you have any suggestions, please let me know.

I am also working on this dataset to find an optimal solution; one suggestion is to use transfer learning with weights pre-trained on Cityscapes.

AkideLiu avatar Jun 08 '22 01:06 AkideLiu

Hi @AkideLiu, I'm looking for baselines on the KITTI dataset, since we'll do some experiments on it. If you have any suggestions, please let me know.

I am also working on this dataset to find an optimal solution; one suggestion is to use transfer learning with weights pre-trained on Cityscapes.

Hi @AkideLiu, we need some published results as the baseline for this dataset. If you find some published papers, please do not hesitate to contact us.

xiexinch avatar Jun 08 '22 11:06 xiexinch

Hi @MengzhangLI @AkideLiu, how about using these results as the baseline? Ref: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020)

xiexinch avatar Jun 13 '22 08:06 xiexinch

Hi @MengzhangLI, I am planning to run around 1000 epochs on a single GPU for three well-known networks, UNet, DeepLabV3+ and PSPNet, as a baseline for this dataset. I will update my local test results once the experiment is finalized.

At this stage, I append the UNet 1000-epoch results as follows:

+---------------+-------+-------+
|     Class     |  IoU  |  Acc  |
+---------------+-------+-------+
|      road     | 90.04 |  96.1 |
|    sidewalk   | 52.39 | 59.23 |
|    building   | 69.12 |  87.9 |
|      wall     | 33.87 | 42.57 |
|     fence     | 34.65 | 43.59 |
|      pole     | 51.69 | 64.28 |
| traffic light |  63.4 | 70.51 |
|  traffic sign |  42.8 | 45.97 |
|   vegetation  | 89.82 | 94.94 |
|    terrain    | 78.42 | 91.19 |
|      sky      | 95.64 | 98.02 |
|     person    |  7.94 |  9.48 |
|     rider     |  0.0  |  0.0  |
|      car      | 85.54 | 94.55 |
|     truck     |  9.93 | 28.45 |
|      bus      | 57.31 | 85.58 |
|     train     |  0.0  |  0.0  |
|   motorcycle  |  0.0  |  0.0  |
|    bicycle    |  2.94 |  3.29 |
+---------------+-------+-------+

+-------+-------+-------+
|  aAcc |  mIoU |  mAcc |
+-------+-------+-------+
| 90.25 | 45.55 | 53.46 |
+-------+-------+-------+

I will fix lint and update configs soon.

Could you please provide an email address where I could send the training logs?

Additionally, will you help modify the training config for multi-GPU setups?

Hi @AkideLiu, I'd like to ask how your training ran: don't the annotations of the dataset need to be converted to train ids first?

xiexinch avatar Jun 14 '22 08:06 xiexinch

Hi @xiexinch, converting the label ids is not required for this dataset. The official format can easily be adapted to the Cityscapes label policy, and the rest of the dataset configuration is identical to Cityscapes. For a quick start: download the zipped data from the official website linked in the PR description, unzip it, run the directory-structure conversion script provided in this PR, and modify the local directory to match the configuration files. Afterwards, feel free to start training.
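For illustration, the core of such a directory-restructuring script might be a deterministic train/val split like the sketch below (an assumption for clarity; the PR's actual script, split ratio, and seeding may differ):

```python
import random


def split_train_val(filenames, val_ratio=0.25, seed=0):
    """Deterministically split KITTI training images into train/val subsets.

    Illustrative only: sorts for reproducibility, shuffles with a fixed
    seed, then carves off the first `val_ratio` fraction as validation.
    """
    rng = random.Random(seed)
    files = sorted(filenames)
    rng.shuffle(files)
    n_val = int(len(files) * val_ratio)
    return files[n_val:], files[:n_val]


# Example: the KITTI semantics benchmark ships 200 annotated training images.
train, val = split_train_val(['%06d_10.png' % i for i in range(200)])
```

The split lists would then drive copying images and annotations into Cityscapes-style train/val directories that the config files point at.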

AkideLiu avatar Jun 14 '22 08:06 AkideLiu

Hi @MengzhangLI @AkideLiu How about using these results as the baseline? Ref MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020)

I have briefly gone through this paper; it's quite a good baseline, as it has a clear performance report and its implementation is open-sourced for reference.

AkideLiu avatar Jun 14 '22 08:06 AkideLiu

Hi @xiexinch, converting the label ids is not required for this dataset. The official format can easily be adapted to the Cityscapes label policy, and the rest of the dataset configuration is identical to Cityscapes. For a quick start: download the zipped data from the official website linked in the PR description, unzip it, run the format conversion scripts provided in this PR, and modify the local directory to match the configuration files. Afterwards, feel free to start training.

I know what you mean, but your conversion script only splits the dataset. I tried to start training, but the official annotations cannot be used directly; training only works after I convert the label ids to train ids.

xiexinch avatar Jun 14 '22 08:06 xiexinch

Hi @xiexinch, converting the label ids is not required for this dataset. The official format can easily be adapted to the Cityscapes label policy, and the rest of the dataset configuration is identical to Cityscapes. For a quick start: download the zipped data from the official website linked in the PR description, unzip it, run the format conversion scripts provided in this PR, and modify the local directory to match the configuration files. Afterwards, feel free to start training.

I know what you mean, but your conversion script only splits the dataset. I tried to start training, but the official annotations cannot be used directly; training only works after I convert the label ids to train ids.

I do not fully understand this problem; could you explain more about this case?

AkideLiu avatar Jun 14 '22 09:06 AkideLiu

Hi @xiexinch, converting the label ids is not required for this dataset. The official format can easily be adapted to the Cityscapes label policy, and the rest of the dataset configuration is identical to Cityscapes. For a quick start: download the zipped data from the official website linked in the PR description, unzip it, run the format conversion scripts provided in this PR, and modify the local directory to match the configuration files. Afterwards, feel free to start training.

I know what you mean, but your conversion script only splits the dataset. I tried to start training, but the official annotations cannot be used directly; training only works after I convert the label ids to train ids.

I do not fully understand this problem; could you explain more about this case?

The image annotations provided by Cityscapes and KITTI are stored as label ids. For training, we must convert the label ids to train ids. You can read the code from cityscapesscripts:

https://github.com/mcordts/cityscapesScripts/blob/aeb7b82531f86185ce287705be28f452ba3ddbb8/cityscapesscripts/helpers/labels.py#L64
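As a rough illustration of that remapping (not the actual cityscapesscripts code; only a few of the 34 Cityscapes labels are shown here):

```python
# Sketch of the labelId -> trainId remapping discussed above, following the
# Cityscapes convention that classes not used in training map to the
# ignore index 255. Partial, illustrative mapping only; the full table
# lives in cityscapesscripts/helpers/labels.py.
LABEL_TO_TRAIN_ID = {
    7: 0,    # road
    8: 1,    # sidewalk
    11: 2,   # building
    26: 13,  # car
}


def relabel(label_ids):
    """Map raw label ids to train ids; unknown/unused ids become 255."""
    return [LABEL_TO_TRAIN_ID.get(i, 255) for i in label_ids]
```

In practice this lookup would be applied per pixel over each annotation image (e.g. via a vectorized array lookup) before or during training.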

I'd like to know: how did your training run if you didn't do this conversion?

xiexinch avatar Jun 14 '22 09:06 xiexinch

Hi @xiexinch, converting the label ids is not required for this dataset. The official format can easily be adapted to the Cityscapes label policy, and the rest of the dataset configuration is identical to Cityscapes. For a quick start: download the zipped data from the official website linked in the PR description, unzip it, run the format conversion scripts provided in this PR, and modify the local directory to match the configuration files. Afterwards, feel free to start training.

I know what you mean, but your conversion script only splits the dataset. I tried to start training, but the official annotations cannot be used directly; training only works after I convert the label ids to train ids.

I do not fully understand this problem; could you explain more about this case?

The image annotations provided by Cityscapes and KITTI are stored as label ids. For training, we must convert the label ids to train ids. You can read the code from cityscapesscripts:

https://github.com/mcordts/cityscapesScripts/blob/aeb7b82531f86185ce287705be28f452ba3ddbb8/cityscapesscripts/helpers/labels.py#L64

I'd like to know: how did your training run if you didn't do this conversion?

Hi @xiexinch, I will try to reproduce the training in a fresh environment and provide an update soon.

AkideLiu avatar Jun 17 '22 10:06 AkideLiu

Apologies for the delay in progress; I was previously taking part in a competition highly similar to (a subset of) this dataset. Here are some experiments based on mmseg: https://github.com/UAws/CV-3315-Is-All-You-Need. At this stage, we have completed the competition and will focus on this PR to resolve the unfinished parts.

AkideLiu avatar Jul 03 '22 05:07 AkideLiu

@xiexinch a solution has been provided for converting labels; could you review this PR?

reference: https://github.com/navganti/kitti_scripts/blob/master/semantics/devkit/kitti_relabel.py

AkideLiu avatar Jul 16 '22 07:07 AkideLiu

@xiexinch a solution has been provided for converting labels; could you review this PR?

reference: https://github.com/navganti/kitti_scripts/blob/master/semantics/devkit/kitti_relabel.py

Thanks for updating this PR, we'll review it asap. :)

xiexinch avatar Jul 16 '22 08:07 xiexinch