captioning topic
X-Trans2Cap
[CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning
AoA-pytorch
A PyTorch implementation of the Attention on Attention module (both self and guided variants) for Visual Question Answering
camel
CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022
ebu-tt-live-toolkit
Toolkit for supporting the EBU-TT Live specification
vistext
VisText is a benchmark dataset for semantically rich chart captioning.
VidSitu
[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
dcase-2020-baseline
Audio captioning baseline system for DCASE 2020 challenge.
caption-by-committee
Using LLMs and pre-trained caption models for super-human performance on image captioning.
video-chat
Sample app that displays live captions in a WebRTC video session using the Deepgram API.