LearnSegmentation
Implement the most common semantic segmentation algorithms
Semantic Segmentation
Semantic segmentation means segmenting an image according to the types of objects that compose it, i.e. assigning a class label to every pixel.
What I plan to do
Implement the most common semantic segmentation algorithms:
- FCN
- Deconvnet
- Segnet
The idea is to provide clean code as a reference, and to have fun implementing those papers.
Example (after 24h of training, tested on some example images from the Internet)
Reference Papers
- Fully Convolutional Networks for Semantic Segmentation
- Learning Deconvolution Network for Semantic Segmentation
- SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
- ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
- Playing for Data: Ground Truth from Computer Games
Block Diagram
Basic Convolutional Autoencoder Architecture
Segnet
Deconvnet
Upsample(Unpool) + CONV
In some papers (Segnet, Deconvnet, FCN) you will see that each "upsample layer" is composed of UNPOOL + CONV.
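As a rough illustration of the UNPOOL + CONV idea, here is a minimal NumPy sketch for a single-channel image. It assumes Segnet-style unpooling, where the positions of the maxima are remembered during pooling and reused when upsampling. The function names (`max_pool_2x2`, `unpool_2x2`, `conv3x3_same`) are made up for this example and are not taken from this repository.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling on an (H, W) array with even H and W.
    Returns the pooled map and, for each window, the index (0..3) of its maximum."""
    h, w = x.shape
    # Rearrange so the last axis holds the 4 values of each 2x2 window.
    windows = x.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3).reshape(h // 2, w // 2, 4)
    return windows.max(axis=-1), windows.argmax(axis=-1)

def unpool_2x2(pooled, idx):
    """Put each pooled value back where its maximum came from; zeros elsewhere."""
    ph, pw = pooled.shape
    windows = np.zeros((ph, pw, 4), dtype=pooled.dtype)
    rows, cols = np.indices((ph, pw))
    windows[rows, cols, idx] = pooled
    # Undo the window rearrangement done in max_pool_2x2.
    return windows.reshape(ph, pw, 2, 2).transpose(0, 2, 1, 3).reshape(ph * 2, pw * 2)

def conv3x3_same(x, kernel):
    """3x3 cross-correlation with zero padding (output has the input's size).
    In a real network the kernel is learned; it densifies the sparse unpooled map."""
    padded = np.pad(x, 1)
    out = np.zeros(x.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out

x = np.array([[1, 2, 0, 0],
              [3, 4, 0, 5],
              [0, 0, 9, 8],
              [7, 0, 6, 0]], dtype=float)
pooled, idx = max_pool_2x2(x)          # encoder side
sparse = unpool_2x2(pooled, idx)       # decoder side: UNPOOL
dense = conv3x3_same(sparse, np.ones((3, 3)) / 9.0)  # decoder side: CONV
```

The unpooled map is mostly zeros, which is why a convolution follows it: the filter spreads the recovered activations over their neighbourhood.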
Basic "Segnet/Deconvolution" Segmentation Architecture
Spatial Loss
This is essentially a spatial multinomial cross-entropy: it is evaluated at each pixel of the output tensor and compares the prediction with the corresponding pixel of the label image.
In TensorFlow
import tensorflow as tf
# model_out: [N, H, W, C] logits; labels_in: [N, H, W, 1] integer class ids.
with tf.name_scope("SPATIAL_SOFTMAX"):
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=model_out, labels=tf.squeeze(labels_in, axis=[3]),
        name="spatial_softmax"))
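To make explicit what this loss computes, here is a minimal NumPy sketch of the same per-pixel cross-entropy. It assumes logits of shape [N, H, W, C] and integer labels of shape [N, H, W]. The function name `spatial_softmax_cross_entropy` is hypothetical and only used for illustration.

```python
import numpy as np

def spatial_softmax_cross_entropy(logits, labels):
    """Mean cross-entropy over all pixels.
    logits: float array [N, H, W, C]; labels: int array [N, H, W] of class ids."""
    # Subtract the per-pixel maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    # Log-softmax over the class axis, independently at every pixel.
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Select the log-probability of the true class at each pixel.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    return -picked.mean()

# Tiny example: 1 image, 2x2 pixels, 4 classes, uniform (all-zero) logits.
logits = np.zeros((1, 2, 2, 4))
labels = np.array([[[0, 1], [2, 3]]])
loss = spatial_softmax_cross_entropy(logits, labels)
```

With uniform logits every class gets probability 1/C, so the loss equals log(C) regardless of the labels; this is a handy sanity check for the first training steps.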
Improving results
One technique to reduce the "blobby" effect in the output of the segmentation network is to use Conditional Random Fields (CRFs) as a post-processing stage, which refines the segmentation by taking into account both the raw RGB features of the image and the class probabilities produced by the network.
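The sketch below illustrates the idea only. It is not the dense CRF used in the literature, but a simplified mean-field refinement on a 4-connected grid, where neighbouring pixels with similar colours pull each other toward the same label. The function name and the parameters `pairwise_weight` and `color_sigma` are assumptions made for this example.

```python
import numpy as np

def refine_with_grid_crf(probs, image, n_iters=5, pairwise_weight=2.0, color_sigma=10.0):
    """Simplified CRF-style refinement (4-neighbour mean-field, Potts-like model).
    probs: [H, W, C] class probabilities from the network; image: [H, W, 3] RGB."""
    unary = np.log(np.clip(probs, 1e-8, 1.0))
    img = image.astype(float)
    q = probs.astype(float).copy()
    for _ in range(n_iters):
        message = np.zeros_like(q)
        for axis in (0, 1):
            for step in (1, -1):
                neighbor_q = np.roll(q, step, axis=axis)
                neighbor_img = np.roll(img, step, axis=axis)
                # Colour similarity: close to 1 for similar pixels, close to 0 across edges.
                w = np.exp(-((img - neighbor_img) ** 2).sum(axis=-1) / (2.0 * color_sigma ** 2))
                # np.roll wraps around the border; cancel those wrapped-around neighbours.
                edge = 0 if step == 1 else -1
                if axis == 0:
                    w[edge, :] = 0.0
                else:
                    w[:, edge] = 0.0
                message += w[..., None] * neighbor_q
        # Combine network evidence with neighbour agreement, then renormalise (softmax).
        logits = unary + pairwise_weight * message
        logits -= logits.max(axis=-1, keepdims=True)
        q = np.exp(logits)
        q /= q.sum(axis=-1, keepdims=True)
    return q

# Toy case: dark left half (class 0), bright right half (class 1), one noisy pixel.
image = np.zeros((6, 6, 3), dtype=np.uint8)
image[:, 3:] = 255
probs = np.zeros((6, 6, 2))
probs[:, :3] = [0.9, 0.1]
probs[:, 3:] = [0.1, 0.9]
probs[2, 1] = [0.4, 0.6]   # wrongly leans toward class 1
refined = refine_with_grid_crf(probs, image)
```

In the toy case the noisy pixel is corrected by its same-coloured neighbours, while the sharp colour edge between the two halves keeps the labels from leaking across the boundary.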
Testing
Download this compressed checkpoint file, extract it somewhere, and change the checkpoint path in the testing notebook.
Training
First create your dataset using the provided notebooks at ./src/notebook (I will add examples for the most common datasets). Then, to start training, use:
python train.py train --input=/dataset_lmdb --gpu=0 --mem_frac=0.8 --learning_rate_init=0.001
python train_unsupervised.py train --input=/dataset_lmdb
Frameworks used
Datasets
- Virtual KITTI dataset
- MSCOCO
- MIT Scene Parsing Benchmark
- MIT Scene Parsing Development Kit
- Cambridge CamSeq
- Human Part Segmentation
- ADE20K
- HuPBA-90 data set
- Sergio Caldera Samples
- Evangelos Samples
- Pedestrian Parsing via Deep Decompositional Network
References
- Semantic Segmentation chapter
- CS231n 2016 Lecture 13
- Pixelwise semantic labelling using deep networks
- Datageeks Data Day - Semantic Segmentation
- Fully Convolutional Networks for Semantic Segmentation talk
- Python Fire
- Conditional Random Field
- Conditional Random Field Presentation
- Deep Convolutional Neural Fields for Depth Estimation from a Single Image
- Computer Vision blog
Reference Projects
- FCN on tensorflow
- Enet Pytorch
- Segnet Tensorflow
- Deconvnet Tensorflow
- FCN, Segnet, UNet on Pytorch
- Pytorch for Semantic Segmentation
- Caffe original Segnet
- Caffe original FCN
- FCN on tensorflow 2
- FCN on tensorflow 3
- Lung Cancer Segmentation
- Blog explanation
- Segmentation Hangout
- Image Segmentation and CRF Notebook