
WSOD with Progressive Knowledge Transfer

This code accompanies the paper "Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer".

It is distributed as a diff against the original maskrcnn-benchmark at commit ecc6b5f. Please follow the instructions in README.old.md to set up the environment. The code has been tested with PyTorch 1.4 and CUDA 10.1.

Key files

  • wsod/coco60_to_voc: YAML configs for the COCO-60-to-VOC experiment.
  • wsod/pseudo_label.py: script that mines pseudo ground truth on the source and target datasets. The output annotation JSON files are saved to ./datasets/coco/annotations/ and ./datasets/voc/VOC2007/.
  • maskrcnn_benchmark/modeling/detector/weak_transfer.py: the multiple-instance learning (MIL) model described in the paper.
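For orientation, here is a minimal, generic sketch of the image-level multiple-instance learning objective (aggregate per-proposal class scores into an image-level score, then apply binary cross-entropy against the image-level labels). The function name and the max-aggregation are illustrative assumptions, not the actual code in weak_transfer.py:

```python
import math

def mil_image_loss(proposal_scores, image_labels):
    """Schematic MIL loss for weakly supervised detection.

    proposal_scores: one list of per-proposal probabilities per class,
        values in (0, 1).
    image_labels: one 0/1 image-level label per class.

    Per-proposal scores are aggregated into a single image-level score
    per class (here with a simple max over proposals), then scored with
    binary cross-entropy against the image-level label.
    """
    eps = 1e-6
    loss = 0.0
    for scores, y in zip(proposal_scores, image_labels):
        p = max(scores)                    # image-level score
        p = min(max(p, eps), 1.0 - eps)    # clamp for numerical safety
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss / len(image_labels)
```

The key property is that only image-level labels are needed: the loss never references box annotations, which is what makes the target domain trainable with weak supervision.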

Data

Follow the instructions in README.old.md to set up the datasets folder. Annotations for the COCO-60, COCO-60-full and VOC datasets are available on Google Drive.

  • coco60_train2017_21987.json, coco60_val2017_969.json, coco60full_train2017_118287.json, coco60full_val2017_5000.json: place under ./datasets/coco/annotations/
  • voc_2007_trainval.json, voc_2007_test.json: place under ./datasets/voc/VOC2007/
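As a convenience, here is a small sanity check that the downloaded files are in place. It is not part of the repo; `missing_files` is a hypothetical helper:

```python
import os

# Expected locations of the downloaded annotation files.
EXPECTED = {
    "datasets/coco/annotations": [
        "coco60_train2017_21987.json",
        "coco60_val2017_969.json",
        "coco60full_train2017_118287.json",
        "coco60full_val2017_5000.json",
    ],
    "datasets/voc/VOC2007": [
        "voc_2007_trainval.json",
        "voc_2007_test.json",
    ],
}

def missing_files(root="."):
    """Return the expected annotation files that are not present under root."""
    missing = []
    for folder, names in EXPECTED.items():
        for name in names:
            path = os.path.join(root, folder, name)
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

Run it from the repository root; an empty return value means the layout matches what the configs expect.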

Demo to reproduce the COCO-60 to VOC experiment

Run the following commands to train three iterations (K = 0, 1, 2) of the algorithm described in the paper. (They can be wrapped in a single shell script and run together.)

Initial Iteration (K=0)

```
# result: output/coco60_to_voc/ocud_it0
python -m torch.distributed.launch --nproc_per_node=4 tools/train_net.py --config-file wsod/coco60_to_voc/ocud_it0.yaml
# result: output/coco60_to_voc/mil_it0
python -m torch.distributed.launch --nproc_per_node=4 tools/train_net.py --config-file wsod/coco60_to_voc/mil_it0.yaml
# result: datasets/voc/VOC2007/voc_2007_trainval_coco60-to-voc_it0_0.8.json
# and datasets/coco/annotations/coco60_{train,val}2017_coco60-to-voc_it0_0.8
python wsod/pseudo_label.py output/coco60_to_voc/mil_it0 coco60_train2017_21987 coco60_val2017_969 coco60-to-voc 0 0.8 | tee output/coco60_to_voc/mil_it0/pseudo.txt
```
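The positional arguments to wsod/pseudo_label.py above are the MIL model's output directory, the two source splits, an experiment tag, the iteration index K, and the score threshold (0.8), which is reflected in the output file names. The role of the threshold can be sketched as follows; the real script does considerably more, and `mine_pseudo_gt` is an illustrative name only:

```python
def mine_pseudo_gt(detections, score_thresh=0.8):
    """Schematic pseudo-ground-truth mining: keep each detection whose
    confidence reaches the threshold as a pseudo box annotation.

    detections: COCO-style detection results, i.e. dicts with
    "image_id", "bbox", "category_id" and "score" keys.
    """
    return [
        {k: d[k] for k in ("image_id", "bbox", "category_id")}
        for d in detections
        if d["score"] >= score_thresh
    ]
```

A high threshold trades recall for precision: the next iteration is trained only on boxes the current model is confident about.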
1st Refinement (K=1)

```
# ocud_it1
python -m torch.distributed.launch --nproc_per_node=4 tools/train_net.py --config-file wsod/coco60_to_voc/ocud_it1.yaml --start_iter 0
# mil_it1
python -m torch.distributed.launch --nproc_per_node=4 tools/train_net.py --config-file wsod/coco60_to_voc/mil_it1.yaml --start_iter 0
# pseudo GT it1
python wsod/pseudo_label.py output/coco60_to_voc/mil_it1 coco60_train2017_21987 coco60_val2017_969 coco60-to-voc 1 0.8 | tee output/coco60_to_voc/mil_it1/pseudo.txt
```
2nd Refinement (K=2)

We can create duplicate configs ocud_it2.yaml and mil_it2.yaml for this iteration, or reuse the previous configs and override the paths on the command line as follows.

```
python -m torch.distributed.launch --nproc_per_node=4 tools/train_net.py --config-file wsod/coco60_to_voc/ocud_it1.yaml --start_iter 0 OUTPUT_DIR "output/coco60_to_voc/ocud_it2" MODEL.WEIGHT "output/coco60_to_voc/ocud_it1/model_final.pth" DATASETS.TRAIN "('coco60_train2017_coco60-to-voc_it1_0.8','coco60_val2017_coco60-to-voc_it1_0.8','voc_2007_trainval_coco60-to-voc_it1_0.8_cocostyle')"
```

```
python -m torch.distributed.launch --nproc_per_node=4 tools/train_net.py --config-file wsod/coco60_to_voc/mil_it1.yaml --start_iter 0 OUTPUT_DIR "output/coco60_to_voc/mil_it2" MODEL.WEIGHT "output/coco60_to_voc/mil_it1/model_final.pth" WEAK.CFG2 "output/coco60_to_voc/ocud_it2/config.yml"
```
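The trailing KEY VALUE pairs in the two commands above are yacs-style config overrides (maskrcnn-benchmark merges them via merge_from_list). A simplified sketch of how such dotted-key overrides apply, assuming a plain-dict config purely for illustration:

```python
def merge_overrides(cfg, opts):
    """Apply KEY VALUE override pairs to a nested dict config.

    Keys use dotted paths like "MODEL.WEIGHT"; values are kept as
    strings here for simplicity (yacs additionally type-checks and
    literal-evals values against the existing config).
    """
    assert len(opts) % 2 == 0, "overrides come in KEY VALUE pairs"
    for key, value in zip(opts[::2], opts[1::2]):
        node = cfg
        *parents, leaf = key.split(".")
        for p in parents:
            node = node.setdefault(p, {})
        node[leaf] = value
    return cfg
```

This is why no ocud_it2.yaml is strictly needed: the it1 config plus overrides fully determines the it2 run.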

```
python wsod/pseudo_label.py output/coco60_to_voc/mil_it2 coco60_train2017_21987 coco60_val2017_969 coco60-to-voc 2 0.8 | tee output/coco60_to_voc/mil_it2/pseudo.txt
```

(Optional) Distill a Faster RCNN

This retrains a Faster R-CNN (ResNet-50-C4) on the pseudo ground truth mined at iteration K=2.

```
python -m torch.distributed.launch --nproc_per_node=4 tools/train_net.py --config-file wsod/coco60_to_voc/distill_resnet50c4.yaml
```

Demo Result

Here are the VOC2007 test APs of the demo above. Note that the paper reports mAP@IoU=0.5 under the VOC07 11-point metric, which is slightly lower than the exact area under the precision-recall curve.
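For reference, a minimal sketch of the VOC07 11-point rule (illustrative only, not the evaluation code used here):

```python
def voc07_11point_ap(recalls, precisions):
    """VOC2007 11-point interpolated AP.

    Average, over the recall thresholds 0.0, 0.1, ..., 1.0, of the
    maximum precision achieved at recall >= threshold (0 if that
    recall level is never reached).
    """
    ap = 0.0
    for t in [i / 10.0 for i in range(11)]:
        ps = [p for r, p in zip(recalls, precisions) if r >= t]
        ap += max(ps) if ps else 0.0
    return ap / 11.0
```

Because precision is sampled at only 11 recall points and interpolated with a max, this metric typically comes out a little below the exact area under the curve.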

mil_it2:

```
use_07_metric=True:
mAP: 0.5875
aeroplane       : 0.5851
bicycle         : 0.4720
bird            : 0.6876
boat            : 0.4561
bottle          : 0.4812
bus             : 0.7835
car             : 0.7515
cat             : 0.8028
chair           : 0.2962
cow             : 0.8010
diningtable     : 0.1465
dog             : 0.7994
horse           : 0.7006
motorbike       : 0.6749
person          : 0.5640
pottedplant     : 0.1211
sheep           : 0.6998
sofa            : 0.5831
train           : 0.7261
tvmonitor       : 0.6180
```
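As a quick sanity check, the reported mAP is just the unweighted mean of the 20 per-class APs above:

```python
# Per-class APs from the table above (VOC2007 test, mil_it2).
aps = [0.5851, 0.4720, 0.6876, 0.4561, 0.4812,
       0.7835, 0.7515, 0.8028, 0.2962, 0.8010,
       0.1465, 0.7994, 0.7006, 0.6749, 0.5640,
       0.1211, 0.6998, 0.5831, 0.7261, 0.6180]

# mAP is the unweighted mean over the 20 VOC classes.
mAP = sum(aps) / len(aps)  # 0.587525, reported rounded as 0.5875
```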

Due to randomness (and this code being a refactored version of the original), the numbers may vary from run to run and differ slightly from the paper's, but the difference should be limited. The example result here reaches 58.75% mAP at K=2, which is slightly higher than the number reported in the paper.

Demo outputs are on Google Drive.

License

wsod_transfer inherits the MIT license from maskrcnn-benchmark. See LICENSE for details.