PapersAnalysis
Deep Learning - Basic Elements
Overview
Basic Elements of Deep Learning
Deep Learning vs Machine Learning
Similarities
- In both cases the goal is to train a model, hence at a high level both deal with the same tools: dataset, loss function, optimization, ...
Differences
- Machine Learning models typically have much less learning capacity than Deep Learning models
- The reason is that their input domain is typically low dimensional (e.g. scalar time series rather than images or videos), because
- either they aim to work on that specific domain
- or they rely on manually engineered features
- feature extraction is essentially a projection into a lower dimensional space
- Deep Learning models target high dimensional input spaces (e.g. images, videos) and do not need any manually engineered features
- In fact, they automatically learn the most appropriate representation of their data
- most appropriate for achieving their goal, which is essentially minimizing the value of the loss function
- This automatic representation learning results from the inductive bias of the architecture, which consists of a hierarchy of layers: this choice forces the network to learn a hierarchy of layer-specific representations, so that each layer's representation depends on that of the previous layer, and the first layer's depends on the high dimensional input
- This requires Deep Learning models to have a much higher learning capacity (as they need to learn both the hierarchical representation and how to solve their problem), and it creates a lot of additional complexity with respect to Machine Learning models, for example
- the loss landscape becomes highly non-convex, hence optimization becomes a hard problem (this is the problem that training the network has to address)
- higher learning capacity significantly increases the chances of overfitting, hence the model becomes very data hungry and specific techniques such as data augmentation, regularization, ... are required to achieve good results
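As a rough illustration of the contrast above, the sketch below puts the two approaches side by side: manually engineered features that project a long signal onto a few numbers, and a stack of layers where each representation is computed from the previous one. Everything here is an assumption made for illustration: the function names (`handcrafted_features`, `make_layer`, `forward`), the layer sizes and the toy signal are hypothetical, and the weights are random and untrained, so only the structure is shown, not any learning.

```python
import math
import random
import statistics


def handcrafted_features(signal):
    """Manually engineered features: project a long signal onto 3 numbers."""
    return [
        statistics.mean(signal),      # average level
        statistics.pstdev(signal),    # spread
        max(signal) - min(signal),    # peak-to-peak range
    ]


def make_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense_tanh(x, layer):
    """Apply one dense layer followed by a tanh non-linearity."""
    weights, biases = layer
    return [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Return the representation produced by every layer, in order."""
    representations = []
    h = x
    for layer in layers:
        h = dense_tanh(h, layer)  # computed only from the previous representation
        representations.append(h)
    return representations


rng = random.Random(0)

# Toy high dimensional input (hypothetical): a noisy sine wave with 256 samples.
signal = [math.sin(0.1 * t) + rng.gauss(0.0, 0.1) for t in range(256)]

# Classical route: a fixed, human-chosen projection 256 -> 3.
features = handcrafted_features(signal)

# Deep route: a hierarchy of layers 256 -> 64 -> 16 -> 4.
sizes = [256, 64, 16, 4]
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
reps = forward(signal, layers)

print("handcrafted:", len(signal), "->", len(features))
print("layer representation sizes:", [len(r) for r in reps])
```

In the first route the projection is fixed by whoever designed the features; in the second, the weights of every layer would be adjusted during training, which is where the extra learning capacity (and the extra difficulty) comes from.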
RNN
- The Recurrent Neural Network is a kind of NN whose topology looks different depending on the perspective from which it is observed
- from a time-agnostic perspective it has a recurrent topology (the state feeds back into itself)
- from a time-specific perspective, i.e. unrolled over time steps, it has a directed topology (directed edges), as each new state depends on the previous state
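The two perspectives can be sketched with a toy scalar recurrent cell. This is a minimal sketch under stated assumptions: the update rule `h_t = tanh(w_h * h_prev + w_x * x_t + b)`, the fixed weights and the function names (`rnn_cell`, `run_recurrent`, `run_unrolled_3`) are all hypothetical choices for illustration, not something taken from these notes.

```python
import math


def rnn_cell(h_prev, x_t, w_h=0.5, w_x=1.0, b=0.0):
    """One recurrent step: the new state depends on the previous state and the input."""
    return math.tanh(w_h * h_prev + w_x * x_t + b)


def run_recurrent(xs, h0=0.0):
    """Time-agnostic view: a single cell whose output state is fed back into it."""
    h = h0
    states = []
    for x_t in xs:
        h = rnn_cell(h, x_t)
        states.append(h)
    return states


def run_unrolled_3(xs, h0=0.0):
    """Time-specific view: the same cell written out as a directed chain of 3 steps."""
    x1, x2, x3 = xs
    h1 = rnn_cell(h0, x1)  # h0 -> h1
    h2 = rnn_cell(h1, x2)  # h1 -> h2
    h3 = rnn_cell(h2, x3)  # h2 -> h3
    return [h1, h2, h3]


xs = [1.0, 0.0, -1.0]
recurrent_states = run_recurrent(xs)
unrolled_states = run_unrolled_3(xs)

# Both views describe the same computation, so the states coincide.
print(all(math.isclose(a, b) for a, b in zip(recurrent_states, unrolled_states)))
```

The loop and the written-out chain compute the same states; the only thing that changes is whether time is left implicit (one cell with a feedback edge) or made explicit (one copy of the cell per time step, connected by directed edges).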