IMProv
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
This repository is the official implementation of IMProv introduced in the paper:
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
Jiarui Xu, Yossi Gandelsman, Amir Bar, Jianwei Yang, Jianfeng Gao, Trevor Darrell, Xiaolong Wang
Visual Results
More results on the project page: https://jerryxu.net/IMProv/
Links
- Jiarui Xu's Project Page (with additional visual results)
- HuggingFace 🤗 Model
- Run the demo on Google Colab
- arXiv Page
Citation
If you find our work useful in your research, please cite:
@article{xu2023improv,
  author  = {Xu, Jiarui and Gandelsman, Yossi and Bar, Amir and Yang, Jianwei and Gao, Jianfeng and Darrell, Trevor and Wang, Xiaolong},
  title   = {{IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks}},
  journal = {arXiv preprint arXiv:2312.01771},
  year    = {2023},
}
:label: TODO
- [x] Release inference code and demo.
- [x] Release checkpoints.
- [ ] Release S2CV dataset.
- [ ] Release training code.
:hammer_and_wrench: Environment Setup
Install dependencies by running:
conda install pytorch=2.0 torchvision pytorch-cuda=11.7 -c pytorch -c nvidia
git clone https://github.com/xvjiarui/IMProv.git
pip install -e IMProv
:arrow_forward: Demo
python demo/demo.py --output demo/output.png
The output is saved in demo/output.png.
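IMProv performs a vision task by inpainting: an example input/output pair and a query image are laid out in a grid, and the model fills in the missing cell with the query's result. As a rough illustration of that layout (not the repository's actual preprocessing; `make_prompt_grid`, the cell size, and the padding behavior here are all assumptions), a 2x2 prompt grid can be assembled like this:

```python
import numpy as np

def make_prompt_grid(example_input, example_output, query, cell=224):
    """Assemble a 2x2 visual prompt: the top row holds an example
    input/output pair, the bottom-left cell holds the query image,
    and the bottom-right cell is left blank for the model to inpaint.
    Illustrative sketch only; IMProv's real pipeline may differ."""
    def fit(img):
        # Pad/crop each image into a fixed square cell (hypothetical helper).
        h, w, c = img.shape
        out = np.zeros((cell, cell, c), dtype=img.dtype)
        out[:min(h, cell), :min(w, cell)] = img[:cell, :cell]
        return out

    blank = np.zeros((cell, cell, 3), dtype=np.uint8)  # region to be inpainted
    top = np.concatenate([fit(example_input), fit(example_output)], axis=1)
    bottom = np.concatenate([fit(query), blank], axis=1)
    return np.concatenate([top, bottom], axis=0)

# Example with dummy RGB images
img = np.random.randint(0, 255, (200, 200, 3), dtype=np.uint8)
grid = make_prompt_grid(img, img, img)
print(grid.shape)  # (448, 448, 3)
```

A text prompt describing the task is additionally fed to the model alongside this image grid, which is what makes the prompting multimodal.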