CaCao
This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World" (Accepted by ICCV 2023)
Complete code for CaCao and boosted SGG
We provide sample code for boosting SGG datasets with CaCao in both the standard setting and the open-world setting.
Enhanced fine-grained predicates for VG
To download the enhanced dataset for VG training, use this Google Drive link.
Running Script Tutorial
python adaptive_cluster.py # obtain initialized clusters for CaCao
python fine_grained_mapping.py # establish the mapping from open-world boosted data to target predicates for enhancement
python cross_modal_tuning.py # obtain cross-modal prompt tuning models for better predicate boosting
python fine_grained_predicate_boosting.py # enhance the existing SGG dataset with our CaCao model in <pre_trained_visually_prompted_model>
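The four commands above can also be chained from a single driver, which is handy when re-running the whole boosting pipeline. The sketch below is a minimal illustration, not part of the repository: `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical names, and it assumes that the scripts are run from the repository root, in the order listed above, and without extra command-line arguments (none are shown in the tutorial).

```python
import subprocess
import sys

# Script names come from the tutorial above; running them in this order
# is an assumption based on the order in which they are listed.
PIPELINE = [
    "adaptive_cluster.py",                 # initial clusters for CaCao
    "fine_grained_mapping.py",             # map boosted data to target predicates
    "cross_modal_tuning.py",               # cross-modal prompt tuning
    "fine_grained_predicate_boosting.py",  # enhance the existing SGG dataset
]


def build_commands(python=sys.executable, scripts=PIPELINE):
    """Return one argv list per stage (hypothetical helper)."""
    return [[python, script] for script in scripts]


def run_pipeline(run=subprocess.run, python=sys.executable):
    """Run every stage in order and stop at the first failure.

    `run` is injectable so the driver can be dry-run without executing
    the actual training scripts.
    """
    for cmd in build_commands(python):
        print("Running:", " ".join(cmd))
        # check=True raises CalledProcessError if a stage exits non-zero,
        # so later stages never run on incomplete outputs.
        run(cmd, check=True)
```

Calling `run_pipeline()` from the repository root executes all four stages; passing a stub as `run` prints the commands without launching them.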
Quantitative Analysis
Qualitative Analysis
Predicate Boosting
Predicate Prediction Distribution
Acknowledgement
The SGG code is built on Scene-Graph-Benchmark.pytorch, FGPL, and SSRCNN (One-Stage). Thanks to the authors for their great work!
📜 Citation
If you find this work useful for your research, please cite our paper and star our git repo:
@inproceedings{yu2023visually,
title={Visually-prompted language model for fine-grained scene graph generation in an open world},
author={Yu, Qifan and Li, Juncheng and Wu, Yu and Tang, Siliang and Ji, Wei and Zhuang, Yueting},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={21560--21571},
year={2023}
}