pytorch-SceneNetRGBD
Implementation of UNet as used in SceneNet RGB-D paper
What does this repository contain
- This repository contains the weights of UNet models trained on the RGB as well as the RGB-D data of the SceneNet RGB-D dataset.
- It has code to reproduce the UNet used in the paper and also provides segmentation evaluation scripts.
- `test_models.py` contains the code to reproduce the numbers reported in the ICCV 2017 paper.
Important things to keep in mind before using the code
- Download the PyTorch models from the Google Drive link. It contains 10 models in `pth` format, 5.8 GB in total.
- This code was converted from the Torch implementation used in the paper. The image scaling in Torch is different from the OpenCV/PIL image scaling (see the Torch GitHub thread), and therefore we provide the RGB and depth files converted from Torch in `npy` format. However, when using these models for fine-tuning, a different image scaling algorithm should not be a problem -- minor scaling discrepancies can easily be subsumed by the fine-tuning process. We only wanted to make sure here that the models produce exactly the numbers stated in the paper.
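Since the converted inputs ship as `npy` files, no image library (and hence no scaling algorithm) is involved in loading them. A minimal sketch with NumPy -- the file name and array shape here are illustrative stand-ins, not the actual files from the repository:

```python
import numpy as np

# Stand-in for one of the converted RGB inputs (shape is illustrative;
# the real converted files are provided with the repository).
rgb = np.random.rand(240, 320, 3).astype(np.float32)
np.save("rgb_example.npy", rgb)

# Loading is a single call -- the array comes back exactly as stored,
# which is why npy files sidestep the Torch vs. OpenCV/PIL scaling issue.
loaded = np.load("rgb_example.npy")
assert loaded.shape == (240, 320, 3)
```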
- The depth scaling used for NYUv2 was 1/1000 and for SUN RGB-D was 1/10000. This means that if you are using the SceneNet RGB-D model fine-tuned on the NYUv2 dataset, you should scale down the depth values by a factor of 1000 before using it for any new experiments. Similarly, you should scale down the depth values by 10000 if you are using the model fine-tuned on SUN RGB-D.
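The scaling step itself is a single division; a sketch with made-up depth values, assuming NYUv2-style depth stored in millimeters:

```python
import numpy as np

# Raw depth map in millimeters (values are made up for illustration).
depth_mm = np.array([[500.0, 1500.0],
                     [2500.0, 4000.0]])

# For the NYUv2 fine-tuned model, scale by 1/1000;
# for the SUN RGB-D fine-tuned model, use 1/10000 instead.
depth_scaled = depth_mm / 1000.0  # -> [[0.5, 1.5], [2.5, 4.0]]
```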
- To obtain the numbers in the paper for 13-class segmentation, run `python test_models.py`.
- If you would like the filtered dataset with more than 3 labels per image, it is available at the Google Drive link. It contains the names of the files, not the PNGs, and is 23 MB in size.
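Because the filtered split is a plain list of file names rather than images, it can be read line by line. A sketch -- the file name and the example entries below are hypothetical, written locally so the snippet is self-contained:

```python
# Hypothetical stand-in for the downloaded list of filtered file names.
with open("filtered_filenames.txt", "w") as f:
    f.write("0/photo/0.png\n0/photo/25.png\n")

# Read the list back, one file name per line, skipping blanks.
with open("filtered_filenames.txt") as f:
    filenames = [line.strip() for line in f if line.strip()]
```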
Updates
- Any future updates will be posted here.