Video2Text
📺 An Encoder-Decoder Model for Sequence-to-Sequence learning: Video to Text
Examples
Video | Text |
---|---|
![]() | a man is driving down a road |
![]() | a man is playing a guitar |
![]() | a woman is cooking eggs in a bowl |
![]() | a man eats pasta |
![]() | a woman is slicing tofu |
![]() | a person is mixing a tortilla |
![]() | a group of people are dancing |
![]() | a person is holding a dog |
Dataset
MSVD Dataset (Download)
1450 videos for training, 100 videos for testing
The input features are extracted with VGG (pretrained on ImageNet).
Model Structures
Training Model
![](https://github.com/alvinbhou/Video2Text/raw/master/images/training_model.png)
Inference Model
Encoder
![](https://github.com/alvinbhou/Video2Text/raw/master/images/inference_encoder_model.png)
Decoder
![](https://github.com/alvinbhou/Video2Text/raw/master/images/inference_decoder_model.png)
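To make the split between the inference encoder and the inference decoder concrete, here is a minimal sketch of a greedy decoding loop in the style of the Keras seq2seq tutorial listed under References. It is not the repository's code: `encode`, `decode_step`, the toy stand-ins, and the token ids are all hypothetical, and the real model may decode differently.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: the "state" is whatever the encoder hands to the decoder.
State = Tuple[float, ...]


def greedy_decode(
    encode: Callable[[Sequence[Sequence[float]]], State],
    decode_step: Callable[[int, State], Tuple[List[float], State]],
    frames: Sequence[Sequence[float]],
    bos_id: int,
    eos_id: int,
    max_len: int = 20,
) -> List[int]:
    """Run the encoder once, then feed the decoder its own predictions."""
    state = encode(frames)  # inference encoder: video features -> state
    token = bos_id          # decoding starts from a begin-of-sentence token
    output: List[int] = []
    for _ in range(max_len):
        # Inference decoder: one step from (previous token, state).
        scores, state = decode_step(token, state)
        # Greedy choice: take the highest-scoring token id.
        token = max(range(len(scores)), key=scores.__getitem__)
        if token == eos_id:
            break
        output.append(token)
    return output


# Toy stand-ins (assumptions) so the sketch runs without a trained model.
def toy_encode(frames: Sequence[Sequence[float]]) -> State:
    return (float(len(frames)),)


def toy_decode_step(token: int, state: State) -> Tuple[List[float], State]:
    # Vocabulary of 4 ids; always predicts the next id, and id 3 acts as EOS.
    scores = [0.0] * 4
    scores[min(token + 1, 3)] = 1.0
    return scores, state


caption_ids = greedy_decode(
    toy_encode, toy_decode_step, [[0.1], [0.2]], bos_id=0, eos_id=3
)
print(caption_ids)
```

In a real setup the returned ids would be mapped back to words through the vocabulary built during preprocessing.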
How to use the code
video2text.py
    usage: video2text.py [-h] --uid UID [--train_path TRAIN_PATH]
                         [--test_path TEST_PATH] [--learning_rate LEARNING_RATE]
                         [--batch_size BATCH_SIZE] [--epoch EPOCH] [--test]

    Video to Text Model

    optional arguments:
      -h, --help            show this help message and exit
      --uid UID             training uid
      --train_path TRAIN_PATH
                            training data path
      --test_path TEST_PATH
                            test data path
      --learning_rate LEARNING_RATE
                            learning rate for training
      --batch_size BATCH_SIZE
                            batch size for training
      --epoch EPOCH         epochs for training
      --test                use this flag for testing
Split the pre-extracted features of videos into training and testing directories. For training you may want to preprocess the data.
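One way to do the split is sketched below. This is an illustration only: the function name `split_features`, the directory names, the `.npy` extension, and the random shuffle are all assumptions, and if your copy of the dataset already ships with a fixed train/test split you should use that instead.

```python
import random
import shutil
import tempfile
from pathlib import Path
from typing import Tuple


def split_features(src_dir, train_dir, test_dir, n_test: int, seed: int = 0) -> Tuple[int, int]:
    """Copy feature files from src_dir into train_dir and test_dir.

    Returns (number of training files, number of testing files).
    """
    files = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    if n_test > len(files):
        raise ValueError("n_test is larger than the number of feature files")

    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(files)

    Path(train_dir).mkdir(parents=True, exist_ok=True)
    Path(test_dir).mkdir(parents=True, exist_ok=True)

    for i, path in enumerate(files):
        dest = Path(test_dir) if i < n_test else Path(train_dir)
        shutil.copy2(path, dest / path.name)
    return len(files) - n_test, n_test


# Demo on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "all").mkdir()
    for i in range(10):
        (root / "all" / f"video{i}.npy").write_bytes(b"")
    counts = split_features(root / "all", root / "train", root / "test", n_test=2)
    print(counts)
```

The resulting directories can then be passed to the script through `--train_path` and `--test_path`.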
For testing, pass the `--test` flag. A sample command that generates the test results:

    python video2text.py --uid best --test
This writes the video-to-text output to test_ouput.txt; the average BLEU score is 0.69009423.
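The exact BLEU variant behind that number is not stated here. As a rough illustration of what a per-caption score can look like, the sketch below computes a simplified unigram BLEU (clipped unigram precision times the brevity penalty) against a single reference. The function name `bleu1` and the example captions are assumptions, and this sketch is not expected to reproduce the score above.

```python
import math
from collections import Counter


def bleu1(candidate: str, reference: str) -> float:
    """Simplified unigram BLEU: clipped precision times brevity penalty."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    # Clip each candidate word count by its count in the reference.
    ref_counts = Counter(ref)
    cand_counts = Counter(cand)
    clipped = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = clipped / len(cand)

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))
    return bp * precision


score = bleu1("a man is playing a guitar", "a man is playing the guitar")
print(round(score, 4))
```

An average over the test set would then be the mean of the per-caption scores.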
For more information, check out the report.
References
Keras Blog: A ten-minute introduction to sequence-to-sequence learning in Keras
ADLxMLDS 2017 Fall Assignment 2
LICENSE
MIT