MSVD topic
video-captioning-models-in-Pytorch
A PyTorch implementation of state-of-the-art video captioning models from 2015–2019 on the MSVD and MSR-VTT datasets.
Video-Description-with-Spatial-Temporal-Attention
[ACM MM 2017 & IEEE TMM 2020] This is the Theano code for the paper "Video Description with Spatial Temporal Attention"
CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
delving-deeper-into-the-decoder-for-video-captioning
Source code for Delving Deeper into the Decoder for Video Captioning
Semantics-AssistedVideoCaptioning
Source code for Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling Strategy
VidIL
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
video_captioning_datasets
A summary of video-to-text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*
video_features_extractor
A Python implementation for extracting several visual feature representations from videos
visual_syntactic_embedding_video_captioning
Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*
attentive_specialized_network_video_captioning
Source code of the paper titled *Attentive Visual Semantic Specialized Network for Video Captioning*