
A PyTorch implementation of "StyleNet: Generating Attractive Visual Captions with Styles"

StyleNet: Generating Attractive Visual Captions with Styles

* under development

StyleNet is a framework for generating attractive captions for images and videos in different styles. Its key component is the factored LSTM, which automatically distills style factors from a monolingual text corpus.
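The factored LSTM can be sketched as follows. This is a summary of the factorization as described in the StyleNet paper, not a statement about how this repository implements it. The dimension names $M$, $E$, $N$ are used here for illustration. Each input-to-hidden weight matrix $W_x$ of a standard LSTM is replaced by a product of three matrices, and only the middle one depends on the style:

```latex
W_x = U_x \, S_x \, V_x,
\qquad U_x \in \mathbb{R}^{M \times E},\;
S_x \in \mathbb{R}^{E \times E},\;
V_x \in \mathbb{R}^{E \times N}

% Example: the input gate, with the recurrent weights W_{ih} left unfactored
i_t = \sigma\left( U_{ix} \, S_{ix} \, V_{ix} \, x_t + W_{ih} \, h_{t-1} \right)
```

Under this reading, $U_x$, $V_x$, and the recurrent weights are shared across all styles, while a separate $S_x$ is learned for each style (for example factual, humorous, romantic), so switching style at generation time amounts to swapping $S_x$.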

Framework (figure)

Examples of generated captions (figure)

Description

  • A PyTorch implementation of StyleNet
  • Authors: Chuang Gan, Zhe Gan, Xiaodong He, Jianfeng Gao, Li Deng
  • Published in: Computer Vision and Pattern Recognition (CVPR), 2017
  • URL: https://www.microsoft.com/en-us/research/wp-content/uploads/2017/06/Generating-Attractive-Visual-Captions-with-Styles.pdf
  • Dataset: https://zhegan27.github.io/Paper.html
  • Slideshare: https://www.slideshare.net/DeepLearningJP2016/dl-hacks-stylenet-generating-attractive-visual-captions-with-styles
  • written by Kota Kakiuchi

Requirements

  • python 3.5.3
  • pytorch 0.2.0
  • torchvision 0.1.9
  • numpy 1.13.3
  • scikit-image 0.13.1
  • nltk 3.2.5
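
Because the versions above are pinned and fairly old, it can help to check an environment against them before running anything. The snippet below is a minimal sketch and is not part of this repository: `REQUIRED`, `installed_versions`, and `find_mismatches` are hypothetical names, and it assumes each package exposes a `__version__` attribute. It uses a prefix match on the assumption that some builds append a suffix to the version string.

```python
import importlib

# Pinned versions copied from the list above; keys are import names
# (torch for pytorch, skimage for scikit-image).
REQUIRED = {
    "torch": "0.2.0",
    "torchvision": "0.1.9",
    "numpy": "1.13.3",
    "skimage": "0.13.1",
    "nltk": "3.2.5",
}


def installed_versions(names):
    """Import each module and read its __version__ (None if unavailable)."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", None)
    return versions


def find_mismatches(installed, required=REQUIRED):
    """Return {name: (installed, wanted)} for packages that do not match.

    A package matches when its installed version string starts with the
    pinned version.
    """
    mismatches = {}
    for name, wanted in required.items():
        have = installed.get(name)
        if have is None or not have.startswith(wanted):
            mismatches[name] = (have, wanted)
    return mismatches
```

Calling `find_mismatches(installed_versions(REQUIRED))` returns an empty dict when every package matches, and otherwise maps each offending import name to its installed and expected versions.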