
Deep Learning Glossary

Simple, opinionated explanations of various things encountered in Deep Learning / AI / ML.

Contributions welcome - there may be errors here!

Contests

ILSVRC = ImageNet Large Scale Visual Recognition Challenge

The most prominent computer vision contest, based on the largest labeled image data set (ImageNet). Progress on its classification task led CNNs to dominate the field of computer vision.

| Year | Model | Top-5 Error | Layers | Paper |
| ---- | ----- | ----------- | ------ | ----- |
| 2012 | AlexNet | 17.0% | 8 | http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks |
| 2013 | ZFNet | 17.0% | 8 | http://arxiv.org/abs/1311.2901 |
| 2014 | VGG-19 | 8.43% | 19 | http://arxiv.org/abs/1409.1556 |
| 2014 | GoogLeNet / Inception | 7.89% | 22 | http://arxiv.org/abs/1409.4842 |
| 2015 | Inception v3 | | | http://arxiv.org/abs/1512.00567 |
| 2015 | ResNet | 4.49% | 152 | http://arxiv.org/abs/1512.03385 |

Techniques

Stochastic Gradient Descent (SGD)

The original and simplest gradient-based optimization algorithm for neural networks: repeatedly step the weights against the loss gradient computed on a mini-batch (via backpropagation). Still used everywhere!
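
A minimal sketch of one update step (names here are illustrative; grad is the mini-batch gradient of the loss with respect to the parameters w):

def sgd_step(w, grad, lr=0.01):
    # Step against the gradient, scaled by the learning rate.
    return w - lr * grad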

SGD with Momentum

A simple and widely used improvement to SGD: accumulate an exponentially decaying sum of past gradients and update the weights along that running direction.
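
A sketch of the classical momentum update, assuming a decay factor mu of around 0.9 (names are illustrative):

def momentum_step(w, grad, velocity, lr=0.01, mu=0.9):
    # Velocity is an exponentially decaying sum of past gradient steps.
    velocity = mu * velocity - lr * grad
    return w + velocity, velocity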

Adagrad

An adaptive optimizer that gives each parameter its own learning rate, shrinking it in proportion to the square root of the sum of that parameter's past squared gradients (Duchi et al., 2011).
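
A sketch of the per-parameter update, using NumPy for the element-wise operations (names are illustrative):

import numpy as np

def adagrad_step(w, grad, sum_sq, lr=0.01, eps=1e-8):
    # Accumulate the squared gradient for each parameter...
    sum_sq = sum_sq + grad ** 2
    # ...and shrink that parameter's effective learning rate accordingly.
    w = w - lr * grad / (np.sqrt(sum_sq) + eps)
    return w, sum_sq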

Adam Optimizer

Kingma & Ba, 2015

FTRL-Proximal (Follow-the-Regularized-Leader)

An online learning algorithm with per-coordinate learning rates and L1 regularization that yields sparse models; used at Google for large-scale click-through-rate prediction.

McMahan et al. (Google), 2013
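
A sketch of the per-coordinate update from Algorithm 1 of the paper (hyperparameter names follow the paper; the default values here are placeholders):

import numpy as np

def ftrl_step(w, grad, z, n, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
    # Per-coordinate adaptive learning rates, as in Adagrad.
    sigma = (np.sqrt(n + grad ** 2) - np.sqrt(n)) / alpha
    z = z + grad - sigma * w
    n = n + grad ** 2
    # Closed-form "follow the regularized leader" solution: soft-thresholding
    # by l1 drives small weights exactly to zero, yielding sparse models.
    w = np.where(np.abs(z) <= l1, 0.0,
                 -(z - np.sign(z) * l1) / ((beta + np.sqrt(n)) / alpha + l2))
    return w, z, n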

Rectified Linear Unit (ReLU)

Rectified linear unit is a common activation function, popularized by AlexNet. Recommended over the sigmoid activation.

relu(x) = max(x, 0)

Nair & Hinton, 2010

Parametric Rectified Linear Unit (PReLU)

A leaky ReLU whose negative-side slope is learned during training rather than fixed (He et al., 2015).

http://arxiv.org/pdf/1502.01852v1

Leaky Rectified Linear Unit

Sometimes the inputs to a ReLU get pushed far negative, in which case the neuron's output and gradient are both zero and it can get permanently stuck off (the "dying ReLU" problem). The leaky ReLU combats this by giving the negative side a small slope, so some gradient always flows.

def leaky_relu(x):
    # A small slope (here 0.01) on the negative side keeps a nonzero gradient.
    return x if x >= 0 else 0.01 * x

Batch Normalization (BN)

Normalizing the inputs to each activation function (to zero mean and unit variance over each mini-batch) can dramatically speed up learning.

Ioffe & Szegedy, 2015

Dropout

Introduced by Hinton et al., 2012 and popularized by AlexNet. Randomly zero out a fraction (commonly 50%) of activations during the forward pass at training time. A simple and effective regularizer.
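
A sketch of the common "inverted" dropout variant, which rescales at training time so the test-time forward pass needs no change (the original formulation instead scaled the weights at test time):

import numpy as np

def dropout(x, p=0.5, training=True):
    if not training:
        return x
    # Zero out each activation with probability p, then rescale the survivors.
    mask = np.random.rand(*x.shape) >= p
    return x * mask / (1.0 - p)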

LSTM = Long Short Term Memory

A type of RNN whose gated, additive cell-state updates mitigate the vanishing gradient problem that plain RNNs suffer over long sequences.

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

The post above is a great walkthrough of the architecture and its variants. Systematic comparisons (Greff et al., 2015, "LSTM: A Search Space Odyssey") conclude that the vanilla LSTM performs about as well as any variant.

Originally invented by Hochreiter & Schmidhuber, 1997
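
A minimal sketch of one LSTM step (the single weight matrix W here stacks the four gate matrices; names are illustrative):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h, c, W, b):
    # Project [input, previous hidden state] to the four gate pre-activations.
    z = np.concatenate([x, h]) @ W + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    g = np.tanh(g)                                # candidate cell update
    # The additive cell-state update is what eases gradient flow through time.
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c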

Models

AlexNet

Winner of ILSVRC 2012. Made a huge jump in accuracy over earlier approaches using a deep CNN trained on GPUs, with dropout and ReLUs.

http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks

VGG-16 / VGG-19 / OxfordNet

Close second place in ILSVRC 2014. A very simple CNN architecture using only 3x3 convolutions, max pooling, ReLUs, and dropout.

Neural Random-Access Machine (NRAM)

A neural architecture that learns to manipulate pointers into an external variable-size memory, enabling it to learn simple algorithmic tasks (Kurach et al., 2015).

http://arxiv.org/pdf/1511.06392v1

Grid-LSTM

Arranges LSTM cells along multiple dimensions of a grid, including the depth dimension, rather than only along time (Kalchbrenner et al., 2015).

Neural Turing Machine (NTM)

An RNN coupled to a differentiable external memory that it learns to read and write, enabling it to learn simple algorithms such as copying and sorting (Graves et al., 2014).

Deep Q Network (DQN)

A convolutional network trained with Q-learning to play Atari games directly from raw pixels (Mnih et al., 2013).

Software

Caffe

A C++/CUDA deep learning framework from Berkeley, focused on convolutional networks; models are defined in declarative config files.

TensorFlow

Google's open-source framework. Computation is expressed as a dataflow graph with automatic differentiation.

Theano

A Python library from the University of Montreal that compiles symbolic expression graphs to fast CPU/GPU code; an early home of automatic differentiation in deep learning.

Torch

A scientific computing framework with a Lua front end, widely used in deep learning research.

CuDNN

NVIDIA's library of GPU-optimized deep learning primitives (convolutions, pooling, activations); used as a backend by most of the frameworks above.

MxNet

A lightweight multi-language deep learning framework that mixes symbolic and imperative programming.

Data sets

CIFAR-10

60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

https://www.cs.toronto.edu/~kriz/cifar.html

ImageNet

A huge labeled image data set spanning millions of images and thousands of classes; the ILSVRC subset has about 1.2 million training images in 1000 classes.

http://image-net.org/

MNIST

Handwritten digits as 28x28 grayscale images; 60,000 training images and 10,000 test images.

http://yann.lecun.com/exdb/mnist/

IAM Handwriting Database

http://www.iam.unibe.ch/fki/databases/iam-handwriting-database

Famously used in Graves's handwriting generation RNN: http://www.cs.toronto.edu/~graves/handwriting.html

TIMIT Speech corpus

A corpus of read American English speech with time-aligned phonetic and word transcriptions; a classic benchmark for phoneme recognition.

(Garofolo et al., 1993)