OSHOT-meta-learning
Demo implementation of Self-Supervision & Meta-Learning for One-Shot Unsupervised Cross-Domain Detection (official code for the paper).
Link to the paper: https://arxiv.org/abs/2106.03496
This paper extends our ECCV 2020 work One-Shot Unsupervised Cross-Domain Detection; the code is therefore based on the original implementation: https://github.com/VeloDC/oshot_detection.
The detection framework is inherited from maskrcnn-benchmark and uses PyTorch and CUDA.
This README will guide you through a full run of our method on the Pascal VOC -> AMD benchmarks.
Implementation details
We build on top of Faster R-CNN with a ResNet-50 backbone pre-trained on ImageNet, 300 top proposals after non-maximum suppression, anchors at three scales (128, 256, 512) and three aspect ratios (1:1, 1:2, 2:1).
For OSHOT we train the base network for 70k iterations using SGD with momentum 0.9; the initial learning rate is 0.001 and decays after 50k iterations. We use a batch size of 1, keep batch normalization layers fixed in both the pretraining and adaptation phases, and freeze the first two blocks of ResNet-50. The weight of the rotation task is set to λ=0.05.
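As a reference, here is a minimal sketch of how the auxiliary rotation objective can be combined with the detection losses during pretraining. The class and function names below are illustrative only and are not the actual modules of this repository.

```python
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch: the self-supervised head classifies which of the four
# rotations (0, 90, 180, 270 degrees) was applied to the input image.
class RotationHead(nn.Module):
    def __init__(self, in_channels=2048, num_rotations=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_rotations)

    def forward(self, features):
        x = self.pool(features).flatten(1)
        return self.fc(x)

def pretraining_loss(det_losses, rot_logits, rot_labels, lam=0.05):
    """Sketch of the OSHOT pretraining objective:
    sum of the detection losses + lambda * rotation classification loss."""
    rot_loss = F.cross_entropy(rot_logits, rot_labels)
    return sum(det_losses.values()) + lam * rot_loss
```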
FULL-OSHOT is trained in two steps. For the first 60k iterations the training is identical to that of OSHOT, while in the last 10k iterations the meta-learning procedure is activated. The inner-loop optimization on the self-supervised task runs for η=5 iterations, and the batch size is 2 to accommodate two transformations of the original image. Specifically, we used gray-scale and color-jitter with brightness, contrast, saturation and hue all set to 0.4. All the other hyperparameters remain unchanged with respect to OSHOT.
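The sketch below illustrates one meta-learning iteration under these settings, using a first-order approximation for brevity. The transformation values are the ones listed above, while `rotation_loss` and `detection_loss` are hypothetical hooks standing in for the actual losses computed by the detector; the real implementation in this repository may differ.

```python
import copy
import torch
from torchvision import transforms

# The two appearance transformations used to build the inner-loop batch
# (values taken from the description above).
meta_transforms = [
    transforms.Grayscale(num_output_channels=3),
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.4),
]

def meta_step(model, image, targets, eta=5, inner_lr=0.001, lam=0.05):
    """First-order sketch of one FULL-OSHOT meta-learning iteration.

    Inner loop: eta updates of a copy of the model on the self-supervised
    rotation task, using the two transformed versions of the image (batch size 2).
    Outer loop: the detection + rotation loss of the adapted copy is
    backpropagated, and the gradients are copied back to the original model
    (first-order shortcut). `rotation_loss` / `detection_loss` are hypothetical.
    """
    adapted = copy.deepcopy(model)
    inner_opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr, momentum=0.9)
    batch = [t(image) for t in meta_transforms]  # gray-scale + color-jittered views

    for _ in range(eta):  # inner optimization on the self-supervised task
        inner_opt.zero_grad()
        loss = sum(adapted.rotation_loss(x) for x in batch)
        loss.backward()
        inner_opt.step()

    # Outer objective evaluated with the adapted parameters.
    inner_opt.zero_grad()
    outer_loss = adapted.detection_loss(image, targets) + lam * adapted.rotation_loss(image)
    outer_loss.backward()
    # First-order shortcut: copy gradients from the adapted copy back to the
    # original model before the outer optimizer step.
    for p, q in zip(model.parameters(), adapted.parameters()):
        p.grad = q.grad.clone() if q.grad is not None else None
    return outer_loss.detach()
```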
Tran-OSHOT differs from OSHOT only in the last 10k training iterations, where the batch size is 2 and the network sees multiple images with different visual appearance in a single iteration. Meta-OSHOT is instead identical to FULL-OSHOT except that the transformations are dropped, so the batch size is 1 also in the last 10k pretraining iterations.
The adaptation phase is the same for all the variants: the model obtained from the pretraining phase is updated via fine-tuning on the self-supervised task. The batch size is 1, and a dropout with probability p=0.5 is added before the rotation classifier to prevent overfitting. The weight of the auxiliary task is increased to λ=0.2 to speed up the adaptation process. All the other hyperparameters and settings are the same as those used during pretraining.
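A minimal sketch of this test-time fine-tuning is shown below. Again, attribute names such as `rotation_head` and `rotation_loss` are assumptions made for illustration and do not necessarily match the code in this repository.

```python
import torch
import torch.nn as nn

def oshot_adapt(model, target_image, num_iters, lr=0.001, lam=0.2):
    """Sketch of the one-shot adaptation phase on a single target image.

    The pretrained model is fine-tuned using only the self-supervised
    rotation objective, weighted by lambda = 0.2, with batch size 1.
    """
    # Dropout before the rotation classifier to limit overfitting on one image.
    model.rotation_head.fc = nn.Sequential(
        nn.Dropout(p=0.5),
        model.rotation_head.fc,
    )
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(num_iters):
        optimizer.zero_grad()
        loss = lam * model.rotation_loss(target_image)  # hypothetical hook
        loss.backward()
        optimizer.step()
    model.eval()
    return model
```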
Installation
Check INSTALL.md for installation instructions.
Datasets
Create a folder named datasets and include the VOC2007 and VOC2012 source datasets (download them from Pascal VOC's website).
Download and extract clipart1k, comic2k and watercolor2k from the authors' website.
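As a rough reference, a plausible layout is sketched below. The exact directory names are an assumption on our part; the paths actually expected by each config file are the ones registered in the codebase (e.g. in maskrcnn_benchmark/config/paths_catalog.py of the original framework).

```
datasets/
  voc/
    VOC2007/     # Annotations/, ImageSets/, JPEGImages/
    VOC2012/
  clipart/
  comic/
  watercolor/
```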
If you would like to get our Social Bikes dataset, please contact me (Francesco Cappio) via email.
Performing pretraining
To perform a standard OSHOT pretraining using Pascal VOC as the source dataset:
python tools/train_net.py --config-file configs/amd/voc_pretrain.yaml
To perform an improved pretraining using our meta-learning-based procedure:
python tools/train_net.py --config-file configs/amd/voc_pretrain_meta.yaml --meta
Once you have performed a pretraining, you can test the output model directly on the target domain or perform the one-shot adaptation.
Testing pretrained model
You can test a pretrained model on one of the AMD targets by referring to the corresponding config file. For example, for clipart:
python tools/test_net.py --config-file configs/amd/oshot_clipart_target.yaml --ckpt <pretrain_output_dir>/model_final.pth
Performing the one-shot adaptation
To use the OSHOT adaptation procedure and obtain results on one of the AMD targets, please refer to one of the config files. For example, for clipart:
python tools/oshot_net.py --config-file configs/amd/oshot_clipart_target.yaml --ckpt <pretrain_output_dir>/model_final.pth
To perform the one-shot adaptation on a model trained with meta-learning, you need to refer to the corresponding _meta config file:
python tools/oshot_net.py --config-file configs/amd/oshot_clipart_target_meta.yaml --ckpt <meta_pretrain_output_dir>/model_final.pth
Qualitative results
Some visualizations for OSHOT, FULL-OSHOT and baseline methods: