IMProv
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
This repository is the official implementation of IMProv introduced in the paper:
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
Jiarui Xu, Yossi Gandelsman, Amir Bar, Jianwei Yang, Jianfeng Gao, Trevor Darrell, Xiaolong Wang
Visual Results
More results on the project page: https://jerryxu.net/IMProv/
Links
- Jiarui Xu's Project Page (with additional visual results)
- HuggingFace 🤗 Model
- Run the demo on Google Colab
- arXiv Page
Citation
If you find our work useful in your research, please cite:
@article{xu2023improv,
  author  = {Xu, Jiarui and Gandelsman, Yossi and Bar, Amir and Yang, Jianwei and Gao, Jianfeng and Darrell, Trevor and Wang, Xiaolong},
  title   = {{IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks}},
  journal = {arXiv preprint arXiv:2312.01771},
  year    = {2023},
}
:label: TODO
- [x] Release inference code and demo.
- [x] Release checkpoints.
- [ ] Release S2CV dataset.
- [ ] Release training code.
:hammer_and_wrench: Environment Setup
Install dependencies by running:
conda install pytorch=2.0 torchvision pytorch-cuda=11.7 -c pytorch -c nvidia
git clone https://github.com/xvjiarui/IMProv.git
pip install -e IMProv
:arrow_forward: Demo
python demo/demo.py --output demo/output.png
The output is saved in demo/output.png.
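IMProv performs a vision task by inpainting: an example input/output pair and a query image are laid out in a grid, and the model fills in the missing cell with the query's result. As a rough illustration of that layout (not the repository's actual preprocessing; `make_prompt_grid`, the cell size, and the padding behavior here are all assumptions), a 2x2 prompt grid can be assembled like this:

```python
import numpy as np

def make_prompt_grid(example_input, example_output, query, cell=224):
    """Assemble a 2x2 visual prompt: the top row holds an example
    input/output pair, the bottom-left cell holds the query image,
    and the bottom-right cell is left blank for the model to inpaint.
    Illustrative sketch only; IMProv's real pipeline may differ."""
    def fit(img):
        # Pad/crop each image into a fixed square cell (hypothetical helper).
        h, w, c = img.shape
        out = np.zeros((cell, cell, c), dtype=img.dtype)
        out[:min(h, cell), :min(w, cell)] = img[:cell, :cell]
        return out

    blank = np.zeros((cell, cell, 3), dtype=np.uint8)  # region to be inpainted
    top = np.concatenate([fit(example_input), fit(example_output)], axis=1)
    bottom = np.concatenate([fit(query), blank], axis=1)
    return np.concatenate([top, bottom], axis=0)

# Example with dummy RGB images
img = np.random.randint(0, 255, (200, 200, 3), dtype=np.uint8)
grid = make_prompt_grid(img, img, img)
print(grid.shape)  # (448, 448, 3)
```

A text prompt describing the task is additionally fed to the model alongside this image grid, which is what makes the prompting multimodal.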