
Learning to Detect Every Thing in an Open World

[Paper] | [Project Page] | [Demo Video]

If you use this code for your research, please cite:

Learning to Detect Every Thing in an Open World.

Kuniaki Saito, Ping Hu, Trevor Darrell, Kate Saenko. In arXiv, 2021. [Bibtex]

News

[2022/02/20] We found a bug in the evaluation code and are rewriting the paper. Please contact the authors if you plan to cite our paper.

[2022/03/11] We fixed the bug in ldet/evaluation/coco_evaluation.py: the PASCAL class IDs were not indexed correctly. We will further update the repo following our updated paper.

[2022/04/11] Updated the evaluation code following the new paper.

Installation

Requirements

Build LDET

  • Create a virtual environment. We used conda to create a new environment.
conda create --name ldet
conda activate ldet
  • Install PyTorch. You can choose the PyTorch and CUDA version according to your machine. Just make sure your PyTorch version matches the prebuilt Detectron2 version (next step). Example for PyTorch v1.10.0:
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

  • Install Detectron2 v0.6. Currently, the codebase is compatible with Detectron2 v0.6. Example for PyTorch v1.10.0 and CUDA v11.3 (a quick version check is sketched after the install steps):
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
  • Install other requirements.
python3 -m pip install -r requirements.txt
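
After installation, you can optionally run a quick sanity check to confirm that the PyTorch, CUDA, and Detectron2 versions match. The versions in the comments below are just the example combination used above:

# Optional sanity check: the installed PyTorch/CUDA versions should match
# the Detectron2 wheel chosen above (torch 1.10 / cu113 / detectron2 0.6 in
# the example commands).
import torch
import detectron2

print("torch:", torch.__version__)            # e.g. 1.10.0
print("cuda:", torch.version.cuda)            # e.g. 11.3
print("detectron2:", detectron2.__version__)  # e.g. 0.6
print("cuda available:", torch.cuda.is_available())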

Code Structure

  • configs: Configuration files
  • ldet
    • data: Code related to dataset configuration.
    • data/copy_paste_mapper.py: Code for our data augmentation.
    • engine: Contains the config file for training.
    • evaluation: Code used for evaluation.
    • modeling: Code for models, including backbones and prediction heads.
  • tools
    • trainer_copypaste.py: Training and testing script.
    • run_test.sh: Evaluation script.
    • run_train.sh: Training script.

Data Preparation

We provide evaluation on COCO, UVO, and Mapillary (v2.0) in this repository:

  • COCO. Trained on the train split, evaluated on the validation split. Download the COCO dataset following the instructions of Detectron2.

  • UVO. We downloaded uvo_videos_sparse.zip and evaluated on those videos. Follow their instructions to split the videos into frames. The JSON split file used for evaluation is available at the Dropbox Link. Update the corresponding line in builtin.py (a registration sketch follows the directory listing below).

E.g., the directory structure of the UVO dataset is as follows:

uvo_frames_sparse/video1/0.png
uvo_frames_sparse/video1/1.png
.
.
.
uvo_frames_sparse/video2/0.png
.
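
The registration of the UVO frames in builtin.py follows the usual Detectron2 COCO-format pattern. The sketch below is only illustrative: the dataset name, JSON filename, and image root are placeholders for your local paths and for whatever names builtin.py actually uses.

# Sketch of a Detectron2-style COCO-format registration for the UVO frames.
# The dataset name, JSON path, and image root below are placeholders; the
# real registration lives in ldet/data/builtin.py and may use other names.
from detectron2.data.datasets import register_coco_instances

register_coco_instances(
    "uvo_sparse_val",                                   # hypothetical dataset name
    {},                                                 # metadata dict
    "datasets/uvo/annotations/uvo_sparse_val.json",     # JSON split from the Dropbox link
    "datasets/uvo/uvo_frames_sparse",                   # directory containing video*/N.png
)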

Trained models

The trained weights are available from the model links in the table below.

| Method | Training Dataset | Evaluation Dataset | box AP | box AR | seg AP | seg AR | Link |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mask RCNN | VOC-COCO | Non-VOC | 1.5 | 10.9 | 0.7 | 9.1 | model \| config |
| Mask RCNNS | VOC-COCO | Non-VOC | 3.4 | 18.0 | 2.2 | 15.8 | model \| config |
| LDET | VOC-COCO | Non-VOC | 5.0 | 30.8 | 4.7 | 27.4 | model \| config |
| Mask RCNNS | COCO | UVO | 25.3 | 42.3 | 20.6 | 35.9 | model \| config |
| LDET | COCO | UVO | 25.8 | 47.5 | 21.9 | 40.7 | model \| config |
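
To try one of the checkpoints on a single image, a standard Detectron2 inference setup works along these lines. This is only a sketch: the checkpoint filename and image path are placeholders, and if the repo's configs define custom keys you may need its own config setup instead of the plain get_cfg().

# Sketch: run a released checkpoint on one image with Detectron2's
# DefaultPredictor. "ldet_coco_model.pth" and "demo.jpg" are placeholders.
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("configs/COCO/mask_rcnn_R_50_FPN.yaml")
cfg.MODEL.WEIGHTS = "ldet_coco_model.pth"          # downloaded checkpoint (placeholder)
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5

predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread("demo.jpg"))
print(outputs["instances"].pred_boxes)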

Training & Evaluation

Training

To train a model, run

## Training on VOC-COCO
sh tools/run_train.sh configs/VOC-COCO/voc_coco_mask_rcnn_R_50_FPN.yaml save_dir
## Training on COCO
sh tools/run_train.sh configs/COCO/mask_rcnn_R_50_FPN.yaml save_dir
## Training on Cityscapes
sh tools/run_train.sh configs/Cityscapes/mask_rcnn_R_50_FPN.yaml save_dir

Note that training produces two output directories: one for the normal model and one for the exponential moving average (EMA) of the model weights. We use the latter for evaluation.
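
For reference, a weight EMA is typically maintained along these lines; this is a conceptual sketch, not necessarily how the trainer in ldet/engine implements it, and the decay value is illustrative:

# Conceptual sketch of an exponential moving average over model weights.
import copy
import torch

def update_ema(ema_model, model, decay=0.999):
    # ema_param <- decay * ema_param + (1 - decay) * param
    with torch.no_grad():
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

# ema_model = copy.deepcopy(model)   # created once before training starts
# update_ema(ema_model, model)       # called after every optimizer step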

Evaluation

To evaluate the trained models, run

## Test on Non-VOC-COCO
sh tools/run_test.sh configs/VOC-COCO/voc_coco_mask_rcnn_R_50_FPN.yaml weight_to_eval
## Test on UVO, Obj365
sh tools/run_test.sh configs/COCO/mask_rcnn_R_50_FPN.yaml weight_to_eval
## Test on Mapillary
sh tools/run_test.sh configs/Cityscapes/mask_rcnn_R_50_FPN.yaml weight_to_eval

The above script reports results in two modes: agnostic and classwise. Agnostic mode regards all instances as a single class, while classwise mode distinguishes between classes. To account for class imbalance, we report AR in classwise mode in our paper, while reporting AP in agnostic mode. Note that the above script computes performance on novel classes only; to get performance on all classes, please disable the exclude_known flag.
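
Conceptually, the difference between the two modes comes down to whether category labels are kept before running the COCO metrics. The snippet below is only an illustration of that idea, not the repo's coco_evaluation.py:

# Illustration only: class-agnostic evaluation collapses every predicted and
# ground-truth category into a single "object" class before computing COCO
# metrics; classwise evaluation keeps the original category_id values.
def to_class_agnostic(coco_style_annotations):
    for ann in coco_style_annotations:
        ann["category_id"] = 1  # single foreground class
    return coco_style_annotations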