awesome-diverse-captioning
Some papers about *diverse* image (and a few video) captioning
Awesome-Diverse-Captioning
A curated list of papers on diverse captioning, mainly for images, sometimes for video, and occasionally for text-only generation. Broadly, diverse visual captioning covers both diverse caption sets (one input, many captions) and distinctive captions (a single caption), with or without explicit control signals. Dense video captioning is excluded because it has become a subarea of video captioning in its own right. More detailed tags will be added later. Feel free to contact me with any comments.
Paper List
2022
-
A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation
Shashi Narayan, Gonçalo Simões, Yao Zhao, Joshua Maynez, Dipanjan Das, Michael Collins, Mirella Lapata (Google)
ACL 2022
[partial code]
conditional
metrics
decoding sampling
-
Hierarchical Sketch Induction for Paraphrase Generation
Tom Hosking, Hao Tang, Mirella Lapata
ACL 2022
controllable
VAEs
-
Generating Scientific Definitions with Controllable Complexity
Tal August, Katharina Reinecke, Noah A. Smith
ACL 2022
controllable
definition modeling
-
CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation
Pei Ke, Hao Zhou, Yankai Lin, Peng Li, Jie Zhou, Xiaoyan Zhu, Minlie Huang
ACL 2022
controllable
metric
-
Controllable Dictionary Example Generation: Generating Example Sentences for Specific Targeted Audiences
Xingwei He, Siu Ming Yiu
ACL 2022
controllable
-
Show, Tell and Rephrase: Diverse Video Captioning via Two-Stage Progressive Training
Zhu Liu, Teng Wang, Jinrui Zhang, Feng Zheng, Wenhao Jiang, Ke Lu
TMM 2022
diversity
metric
2021
-
Human-Like Controllable Image Captioning With Verb-Specific Semantic Roles
Long Chen, Zhihong Jiang, Jun Xiao, Wei Liu
CVPR 2021
controllable
-
Towards Accurate Text-Based Image Captioning With Content Diversity Exploration
Guanghui Xu, Shuaicheng Niu, Mingkui Tan, Yucheng Luo, Qing Du, Qi Wu
CVPR 2021
-
Open-Book Video Captioning With Retrieve-Copy-Generate Network
Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Ying Deng, Weiming Hu
CVPR 2021
-
Question-controlled Text-aware Image Captioning
Anwen Hu, Shizhe Chen, Qin Jin
ACMMM 2021
controllable
-
O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning (Short)
Fenglin Liu, Xuancheng Ren, Xian Wu, Bang Yang, Shen Ge, Yuexian Zou, Xu Sun
ACL 2021
controllable
-
Control Image Captioning Spatially and Temporally
Kun Yan, Lei Ji, Huaishao Luo, Ming Zhou, Nan Duan, Shuai Ma
ACL 2021
controllable (mouse traces)
-
Understanding Guided Image Captioning Performance across Domains
Edwin G. Ng, Bo Pang, Piyush Sharma, Radu Soricut
CoNLL 2021
controllable (semantic label)
-
Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning
Jiahe Shi, Yali Li, Shengjin Wang
ICCV 2021
controllable
2020
-
LNFMM: Latent Normalizing Flows for Many-to-Many Cross Domain Mappings
Shweta Mahajan, Iryna Gurevych, Stefan Roth
ICLR 2020
[pytorch-code] [openreview]
-
Diverse Image Captioning with Context-Object Split Latent Spaces
Shweta Mahajan and Stefan Roth
NeurIPS 2020
[pytorch-code] [review]
diversity
-
On Diversity in Image Captioning: Metrics and Methods
Qingzhong Wang, Jia Wan, Antoni B. Chan
TPAMI 2020
[pytorch-code]
survey
diversity
metrics
-
Improving Image Captioning Evaluation by Considering Inter References Variance
Yanzhi Yi, Hangyu Deng, Jinglu Hu
ACL 2020
[code]
metrics
-
Better Captioning with Sequence-Level Exploration
Jia Chen and Qin Jin
CVPR 2020
[video]
diversity
2019
-
POS: Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech
Aditya Deshpande, Jyoti Aneja, Liwei Wang, Alexander Schwing, David Forsyth
CVPR 2019
diversity
controllable
-
Generating Diverse and Descriptive Image Captions Using Visual Paraphrases
Lixin Liu, Jiajun Tang, Xiaojun Wan, Zongming Guo
ICCV 2019
descriptiveness
-
Controllable Video Captioning With POS Sequence Guidance Based on Gated Fusion Network
Bairui Wang, Lin Ma, Wei Zhang, Wenhao Jiang, Jingwen Wang, Wei Liu
ICCV 2019
[pytorch-code]
controllable
-
VSSI-cap: Variational Structured Semantic Inference for Diverse Image Captioning
Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang
NeurIPS 2019
diversity
VAE
discriminativeness
-
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2019
[pytorch-code]
controllable
-
Intention Oriented Image Captions with Guiding Objects
Yue Zheng, Yali Li and Shengjin Wang
CVPR 2019
[unfinished-code]
controllable (object labels)
-
Towards Diverse and Accurate Image Captions via Reinforcing Determinantal Point Process
Qingzhong Wang, Antoni B. Chan
arXiv 2019
[pytorch-code]
-
Curiosity-driven Reinforcement Learning for Diverse Visual Paragraph Generation
Yadan Luo, Zi Huang, Zheng Zhang, Ziwei Wang, Jingjing Li, Yang Yang
ACM MM 2019
-
Engaging Image Captioning via Personality
Kurt Shuster, Samuel Humeau, Hexiang Hu, Antoine Bordes, Jason Weston
CVPR 2019
[Openreview for ICLR 19]
-
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning
Jyoti Aneja, Harsh Agrawal, Dhruv Batra, Alexander Schwing
ICCV 2019
diversity
VAE
-
Describing Like Humans: On Diversity in Image Captioning
Qingzhong Wang, Antoni B. Chan
CVPR 2019
-
MSCap: Multi-Style Image Captioning With Unpaired Stylized Text
Longteng Guo, Jing Liu, Peng Yao, Jiangwei Li, Hanqing Lu
CVPR 2019
2018
-
GroupCap: Group-Based Image Captioning With Structured Relevance and Diversity Constraints
Fuhai Chen, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Jinsong Su
CVPR 2018
-
A Neural Compositional Paradigm for Image Captioning
Bo Dai, Sanja Fidler, Dahua Lin
NeurIPS 2018
[lua-code] [open review]
-
Diverse and Coherent Paragraph Generation from Images
Moitreya Chatterjee and Alexander G. Schwing
ECCV 2018
[pytorch-code]
-
Categorizing Concepts With Basic Level for Vision-to-Language
Hanzhang Wang, Hanli Wang, Kaisheng Xu
CVPR 2018
2017
-
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training
Rakshith Shetty, Marcus Rohrbach, Lisa Anne Hendricks, Mario Fritz, Bernt Schiele
ICCV 2017
diversity
GAN
-
Towards Diverse and Natural Image Descriptions via a Conditional GAN
Bo Dai, Sanja Fidler, Raquel Urtasun, Dahua Lin
ICCV 2017
GAN
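[video]
-
Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space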
Liwei Wang, Alexander Schwing, Svetlana Lazebnik
NeurIPS 2017
[Review]
diversity
VAE
-
Weakly Supervised Dense Video Captioning
Zhiqiang Shen, Jianguo Li, Zhou Su, Minjun Li, Yurong Chen, Yu-Gang Jiang, Xiangyang Xue
CVPR 2017
VAE
-
From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning
Jingkuan Song, Yuyu Guo, Lianli Gao, Xuelong Li, Alan Hanjalic, Heng Tao Shen
IEEE Trans Neural Netw Learn Syst 2017
VAE
2016
-
Diverse Image Captioning via GroupTalk
Zhuhao Wang, Fei Wu, Weiming Lu, Jun Xiao, Xi Li, Zitong Zhang, Yueting Zhuang
IJCAI 2016
-
Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models
Ashwin K. Vijayakumar, Michael Cogswell, Ramprasaath R. Selvaraju, Qing Sun, Stefan Lee, David J. Crandall, Dhruv Batra
CoRR 2016
[lua-code] [demo] [openreview from ICLR'17]
2015
-
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille
ICLR 2015
[code:TF-mRNN] [code:mRNN-CR]
diversity
consensus re-ranking
Main Reference
https://openaccess.thecvf.com/menu
https://openreview.net/