DallEval
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023)
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers
- Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal (UNC Chapel Hill)
- Paper
[Figure: teaser image]
Visual Reasoning
[Figure: skill image]
Please see ./paintskills for our DETR-based visual reasoning skill evaluation.
(Optional) Please see https://github.com/aszala/PaintSkills-Simulator for our 3D Simulator implementation.
Image Quality & Image-Text Alignment
[Figure: alignment and quality image]
Please see ./quality for our image quality evaluation based on FID score.
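For reference, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. The formula below is the standard textbook definition, not something read from the ./quality code; the symbols ($\mu_r, \Sigma_r$ for real-image feature mean and covariance, $\mu_g, \Sigma_g$ for generated images) are notation introduced here for illustration.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated-image feature distribution is closer to the real one.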
Please see ./retrieval for our image-text alignment evaluation with CLIP-based R-precision.
Please see ./captioning for our image-text alignment evaluation with VL-T5 captioning.
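As a rough illustration of the R-precision metric mentioned above, the sketch below scores each image against a pool of candidate captions and checks whether the ground-truth caption lands in the top R. It is a minimal sketch, not the code in ./retrieval: the function name `r_precision`, the toy similarity values, and the pool size are all assumptions, and in practice the scores would come from CLIP image and text embeddings.

```python
from typing import Sequence


def r_precision(
    similarity: Sequence[Sequence[float]],
    gt_index: Sequence[int],
    r: int = 1,
) -> float:
    """Fraction of images whose ground-truth caption ranks in the top-r candidates.

    similarity[i][j] is the image-text score between image i and candidate
    caption j; gt_index[i] is the position of the true caption for image i.
    (Hypothetical helper, not part of the DALL-Eval code base.)
    """
    hits = 0
    for scores, gt in zip(similarity, gt_index):
        # Rank candidate captions by descending similarity.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if gt in ranked[:r]:
            hits += 1
    return hits / len(similarity)


# Toy scores: 3 images x 4 candidate captions (made-up values, not CLIP outputs).
toy_scores = [
    [0.9, 0.1, 0.2, 0.3],  # true caption 0 ranks first
    [0.4, 0.2, 0.8, 0.1],  # true caption 1 ranks third
    [0.1, 0.2, 0.3, 0.7],  # true caption 3 ranks first
]
toy_gt = [0, 1, 3]

score = r_precision(toy_scores, toy_gt, r=1)
print(f"R-precision (R=1): {score:.3f}")
```

With R=1, two of the three toy images retrieve their own caption first, so the score is 2/3; raising R makes the metric more lenient.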
Social Bias
[Figure: bias exp image]
Please see ./biases for our CLIP-based social (gender and racial) bias evaluation.
Models
We provide training and inference scripts for DALLE-small (DALLE-pytorch), ruDALL-E XL, minDALL-E, and X-LXMERT.
Acknowledgments
We thank the developers of DETR, DALLE-pytorch, ruDALL-E, minDALL-E, and X-LXMERT for their public code releases.
Reference
Please cite our paper if you use our dataset in your work:
@article{Cho2022DallEval,
title = {DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers},
author = {Jaemin Cho and Abhay Zala and Mohit Bansal},
year = {2022},
archivePrefix = {arXiv},
primaryClass = {cs.CV},
eprint = {2202.04053}
}