mscoco-dataset topic
bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
coco-caption
Adds the SPICE metric to the coco-caption evaluation server code
mobile-segmentation
Real-time semantic image segmentation on mobile devices
easy-faster-rcnn.pytorch
An easy implementation of Faster R-CNN (https://arxiv.org/pdf/1506.01497.pdf) in PyTorch.
pvse
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)
easy-fpn.pytorch
An easy implementation of FPN (https://arxiv.org/pdf/1612.03144.pdf) in PyTorch.
SegCaps
A clone of the original SegCaps source code, with enhancements for the MS COCO dataset.
GAN
We aim to generate realistic images from text descriptions using a GAN architecture. The network we designed is used for image generation on two datasets: MSCOCO and CUBS.
Fine-Grained-Image-Captioning
The PyTorch implementation of “Fine-Grained Image Captioning with Global-Local Discriminative Objective”
self-critical
PyTorch implementation of the paper "Self-critical Sequence Training for Image Captioning"