Foreground-Object-Search-Dataset-FOSD
[ICCV 2023] The datasets and code used in our paper "Foreground Object Search by Distilling Composite Image Feature", ICCV2023.
Foreground Object Search Dataset FOSD
This is the official repository for the following paper:
Foreground Object Search by Distilling Composite Image Feature [arXiv]
Bo Zhang, Jiacheng Sui, Li Niu
Accepted by ICCV 2023.
Our model has been integrated into our image composition toolbox libcom (https://github.com/bcmi/libcom). You are welcome to visit and try it \(^▽^)/
Requirements
- See requirements.txt for other dependencies.
Data Preparation
- Download the Open-Images-v6 training set from the Open Images V6 - Download page and unzip it. We recommend using FiftyOne to download the Open-Images-v6 dataset. After the download, the Open-Images-v6 dataset should have the following structure:
Open-Images-v6
├── metadata
├── train
│   ├── data
│   │   ├── xxx.jpg
│   │   ├── xxx.jpg
│   │   ...
│   └── labels
│       ├── masks
│       │   ├── 0
│       │   │   ├── xxx.png
│       │   │   ├── xxx.png
│       │   │   ...
│       │   ├── 1
│       │   ...
│       ├── segmentations.csv
│       ...
- Download the S-FOSD annotations, the R-FOSD annotations, and the R-FOSD background images from Baidu disk (code: 3wvf) and save them to the appropriate locations under the data directory, following the data structure below.
- Generate backgrounds and foregrounds.
python prepare_data/fetch_data.py --open_images_dir <path/to/open/images>
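For the first step above, the recommended FiftyOne download might look like the following sketch. It relies on FiftyOne's dataset zoo (`fiftyone.zoo.load_zoo_dataset` with the `open-images-v6` entry). The choice of `label_types=["segmentations"]` is an assumption based on the `masks` and `segmentations.csv` entries in the expected layout, and the helper names `open_images_kwargs` and `download_open_images` are hypothetical. The files land in FiftyOne's configured zoo directory, so `--open_images_dir` would need to point there.

```python
def open_images_kwargs(split="train", max_samples=None):
    """Build keyword arguments for FiftyOne's Open Images V6 zoo loader.

    Assumption: only segmentation labels are needed, since the expected
    layout lists labels/masks and labels/segmentations.csv.
    """
    kwargs = {"split": split, "label_types": ["segmentations"]}
    if max_samples is not None:
        # Optional cap, handy for a quick trial download.
        kwargs["max_samples"] = max_samples
    return kwargs


def download_open_images(split="train", max_samples=None):
    """Download (or reuse) the split through the FiftyOne dataset zoo."""
    # Imported lazily so this module can be loaded without FiftyOne installed.
    import fiftyone.zoo as foz

    return foz.load_zoo_dataset(
        "open-images-v6", **open_images_kwargs(split, max_samples)
    )
```

Calling `download_open_images(max_samples=100)` first is a cheap way to confirm the setup before fetching the full training split.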
The data structure is like this:
data
├── metadata
│   ├── classes.csv
│   └── category_embeddings.pkl
├── test
│   ├── bg_set1
│   │   ├── xxx.jpg
│   │   ├── xxx.jpg
│   │   ...
│   ├── bg_set2
│   │   ├── xxx.jpg
│   │   ├── xxx.jpg
│   │   ...
│   ├── fg
│   │   ├── xxx.jpg
│   │   ├── xxx.jpg
│   │   ...
│   └── labels
│       ├── masks
│       │   ├── 0
│       │   │   ├── xxx.png
│       │   │   ├── xxx.png
│       │   │   ...
│       │   ├── 1
│       │   ...
│       ├── test_set1.json
│       ├── test_set2.json
│       └── segmentations.csv
└── train
    ├── bg
    │   ├── xxx.jpg
    │   ├── xxx.jpg
    │   ...
    ├── fg
    │   ├── xxx.jpg
    │   ├── xxx.jpg
    │   ...
    └── labels
        ├── masks
        │   ├── 0
        │   │   ├── xxx.png
        │   │   ├── xxx.png
        │   │   ...
        │   ├── 1
        │   ...
        ├── train_sfosd.json
        ├── train_rfosd.json
        ├── category.json
        ├── number_per_category.csv
        └── segmentations.csv
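Before training or evaluating, it can help to confirm that the data directory matches the layout above. The following is a minimal sketch, not part of the repository: the function name `missing_entries` is hypothetical, and the path list assumes that the annotation files (`test_set1.json`, `train_sfosd.json`, and so on) sit inside the `labels` directories, as the tree suggests.

```python
from pathlib import Path

# Relative paths read off the directory tree above.
# Assumption: JSON/CSV annotation files live under the labels directories.
EXPECTED_PATHS = [
    "metadata/classes.csv",
    "metadata/category_embeddings.pkl",
    "test/bg_set1",
    "test/bg_set2",
    "test/fg",
    "test/labels/masks",
    "test/labels/test_set1.json",
    "test/labels/test_set2.json",
    "test/labels/segmentations.csv",
    "train/bg",
    "train/fg",
    "train/labels/masks",
    "train/labels/train_sfosd.json",
    "train/labels/train_rfosd.json",
    "train/labels/category.json",
    "train/labels/number_per_category.csv",
    "train/labels/segmentations.csv",
]


def missing_entries(data_root):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries("data")
    if missing:
        print("Missing entries under data/:")
        for rel in missing:
            print("  " + rel)
    else:
        print("data/ matches the expected layout.")
```

An empty result only shows that the top-level entries exist; it does not validate the contents of the images or annotation files.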
Pretrained Model
We provide one checkpoint (Baidu disk code: 7793) for evaluation on the S-FOSD dataset and another checkpoint (Baidu disk code: 6kme) for evaluation on the R-FOSD dataset. By default, we assume that the pretrained models are downloaded and saved to the checkpoints directory.
Testing
Evaluation on S-FOSD Dataset
python evaluate/evaluate.py --testOnSet1
Evaluation on R-FOSD Dataset
python evaluate/evaluate.py --testOnSet2
The evaluation results will be stored in the eval_results directory.
To save the top 20 results on R-FOSD, add the --saveTop20 flag. By default, the top 20 results on R-FOSD will be stored in the top20 directory.
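Conceptually, a top-20 result is a ranking of candidate foregrounds by predicted score. The sketch below illustrates that idea only; it is not the repository's code. It assumes a hypothetical mapping from foreground file name to score in which a higher score means a more compatible foreground.

```python
import heapq


def top_k_foregrounds(scores, k=20):
    """Return the k (name, score) pairs with the highest scores, best first.

    `scores` is a hypothetical dict mapping a foreground file name to its
    predicted compatibility score (assumed: higher is better).
    """
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    example = {"fg_a.jpg": 0.12, "fg_b.jpg": 0.87, "fg_c.jpg": 0.45}
    for name, score in top_k_foregrounds(example, k=2):
        print(name, score)
```

`heapq.nlargest` avoids sorting the full candidate list, which matters little for a few hundred foregrounds but keeps the sketch cheap for larger pools.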
To save the model's prediction scores on R-FOSD, add the --saveScores flag. By default, the model scores on R-FOSD will be stored in the model_scores directory.
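When scripting several evaluation runs, the commands above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of the repository. It uses only the script path and flags named in this section, and it rejects the saving flags for S-FOSD because they are described here only for R-FOSD (that restriction is this sketch's own assumption).

```python
import shlex

# Dataset name -> flag, as given in the Testing section.
DATASET_FLAGS = {"sfosd": "--testOnSet1", "rfosd": "--testOnSet2"}


def build_eval_command(dataset, save_top20=False, save_scores=False):
    """Return the evaluation command as an argument list."""
    if dataset not in DATASET_FLAGS:
        raise ValueError("dataset must be 'sfosd' or 'rfosd'")
    if dataset != "rfosd" and (save_top20 or save_scores):
        # Sketch-level restriction: the README documents these flags for R-FOSD.
        raise ValueError("--saveTop20 and --saveScores are documented for R-FOSD")
    cmd = ["python", "evaluate/evaluate.py", DATASET_FLAGS[dataset]]
    if save_top20:
        cmd.append("--saveTop20")
    if save_scores:
        cmd.append("--saveScores")
    return cmd


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run to execute it.
    print(shlex.join(build_eval_command("rfosd", save_top20=True)))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues.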
Training
Please download the pretrained teacher models from Baidu disk (code: 40a5) and save them to the checkpoints/teacher directory.
To train a new S-FOSD model, run:
./train/train_sfosd.sh
Similarly, to train a new R-FOSD model, run:
./train/train_rfosd.sh
FOS Score
Our model can be used to evaluate the compatibility between foreground and background in terms of geometry and semantics.
To launch the demo, you can run:
python demo/demo_ui.py
Follow these three steps to get a compatibility score for a foreground and a background:
- Upload a background image in the left box of the first row.
- In the right box of the first row, click the top-left point and then the bottom-right point of the bounding box.
- Upload a foreground image in the left box of the second row, then click the 'run' button.
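The second step defines the placement box from two clicked points. How the demo processes those clicks internally is not shown here. The sketch below is a hypothetical helper illustrating one reasonable way to turn two clicks into a valid box: it tolerates clicks in either order and clamps the coordinates to the image bounds.

```python
def box_from_clicks(p1, p2, image_width, image_height):
    """Turn two clicked (x, y) points into a clamped (x1, y1, x2, y2) box.

    Hypothetical helper: accepts the two corners in either order and keeps
    the box inside an image of the given size (pixel coordinates).
    """
    def clamp(value, upper):
        return max(0, min(value, upper))

    xs = sorted((p1[0], p2[0]))
    ys = sorted((p1[1], p2[1]))
    x1, x2 = clamp(xs[0], image_width - 1), clamp(xs[1], image_width - 1)
    y1, y2 = clamp(ys[0], image_height - 1), clamp(ys[1], image_height - 1)
    if x2 <= x1 or y2 <= y1:
        raise ValueError("the two clicks must span a box with non-zero area")
    return x1, y1, x2, y2


if __name__ == "__main__":
    # Top-left click at (10, 20), bottom-right click at (110, 220).
    print(box_from_clicks((10, 20), (110, 220), 640, 480))
```

Rejecting zero-area boxes early gives clearer feedback than letting a degenerate box reach the model.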
Other Resources
License
Both the background and foreground images of S-FOSD belong to Open-Images. The background images of R-FOSD were collected from the Internet and are licensed under a Creative Commons Attribution 4.0 License.