Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023
FeiElysia
Official Repository of Multi-Object Hallucination in Vision-Language Models
sled-group