# TensorFlow 2.0 playground

~~This is the repo for working with neural networks with TensorFlow 2.0 beta.~~ TensorFlow 2.0 is out now!
CPU-only installation: `pip install tensorflow`

For the GPU version, please consult the GPU Guide.
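A quick post-install sanity check (assumes TF 2.1+ for `tf.config.list_physical_devices`; on 2.0 use `tf.config.experimental.list_physical_devices`):

```python
import tensorflow as tf

print(tf.__version__)                          # should print 2.x
print(tf.config.list_physical_devices("GPU"))  # non-empty list if the GPU build sees a device
```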
## Contents
### Stand-alone examples

- dataset == MNIST
  - subclass model with eager execution mode in a custom loop (a minimal sketch of this pattern follows the list)
  - above + `@tf.function` decorator
  - above with `tf.keras.Sequential` model
  - above with `tf.keras.fit` API
  - metric learning with additive angular margin loss
    (Reference: ArcFace: Additive Angular Margin Loss for Deep Face Recognition)
- dataset == cifar10
  - simple CNN model
  - above with mixup
    (Reference: mixup: Beyond empirical risk minimization)
  - above with ICT
    (Reference: Interpolation Consistency Training for Semi-Supervised Learning)
- dataset == titanic
  - unsupervised categorical feature embedding
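The scripts themselves live in the repo; as a rough sketch of the subclass-model / custom-loop / `@tf.function` pattern they share (synthetic data here, not the repo's actual MNIST pipeline):

```python
import tensorflow as tf

class MLP(tf.keras.Model):
    """Tiny subclassed model: flatten -> dense -> logits."""
    def __init__(self):
        super().__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.hidden = tf.keras.layers.Dense(128, activation="relu")
        self.out = tf.keras.layers.Dense(10)

    def call(self, x):
        return self.out(self.hidden(self.flatten(x)))

model = MLP()
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

@tf.function  # traces the step into a graph; drop the decorator for pure eager debugging
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# synthetic batch standing in for MNIST
x = tf.random.uniform((32, 28, 28))
y = tf.random.uniform((32,), maxval=10, dtype=tf.int32)
print(train_step(x, y).numpy())
```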
### Tutorial Notebooks - I call this 'Learn Tensorflow 2.0 the Hard Way'

- Tensors, Variables, Operations and AutoDiff (see the autodiff sketch after this list)
- AutoGraph
- Custom Model and Layer
- Optimizers
- Loss Function
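For a flavor of the first notebook's topics, a minimal autodiff sketch:

```python
import tensorflow as tf

w = tf.Variable(3.0)
x = tf.constant(2.0)

with tf.GradientTape() as tape:
    y = w * x ** 2          # y = w * x^2

# dy/dw = x^2 = 4.0
print(tape.gradient(y, w).numpy())
```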
### Losses
| Loss | Reference | Year |
|---|---|---|
| ApproxNDCG | A General Approximation Framework for Direct Optimization of Information Retrieval Measures | 2008 |
| Smooth L1 Loss | Fast R-CNN | 2015 |
| ArcFace | ArcFace: Additive Angular Margin Loss for Deep Face Recognition | 2018 |
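As an illustration of what these loss implementations look like, a hedged sketch of Smooth L1 loss; the repo's version may differ in its delta threshold and reduction:

```python
import tensorflow as tf

def smooth_l1_loss(y_true, y_pred, delta=1.0):
    """Quadratic below |error| < delta, linear above (Fast R-CNN)."""
    error = tf.abs(y_true - y_pred)
    quadratic = 0.5 * tf.square(error)
    linear = delta * (error - 0.5 * delta)
    return tf.reduce_mean(tf.where(error < delta, quadratic, linear))

print(smooth_l1_loss(tf.constant([1.0, 2.0]), tf.constant([1.2, 4.0])).numpy())
```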
### Optimizers
| Optimizer | Reference | Year |
|---|---|---|
| LARS | Large Batch Training of Convolutional Networks | 2017 |
| SWA | Averaging Weights Leads to Wider Optima and Better Generalization | 2018 |
| Yogi | Adaptive Methods for Nonconvex Optimization | 2018 |
| RAdam | On the Variance of the Adaptive Learning Rate and Beyond | 2019 |
| LAMB | Large Batch Optimization for Deep Learning: Training BERT in 76 minutes | 2019 |
| Lookahead | Lookahead Optimizer: k steps forward, 1 step back | 2019 |
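As a sketch of the idea behind SWA (not necessarily how the repo implements it): keep a running average of the weights visited late in training and load the average back into the model at the end. `train_one_epoch` in the usage comments is a hypothetical helper.

```python
import tensorflow as tf

def update_swa(swa_weights, model, n_averaged):
    """Running average of weights: swa <- (swa * n + w) / (n + 1)."""
    return [(swa * n_averaged + w) / (n_averaged + 1)
            for swa, w in zip(swa_weights, model.get_weights())]

# usage sketch: start averaging once the learning-rate schedule has settled
# swa_weights = model.get_weights()
# for epoch in range(swa_start, epochs):
#     train_one_epoch(model)          # hypothetical training helper
#     swa_weights = update_swa(swa_weights, model, epoch - swa_start)
# model.set_weights(swa_weights)
```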
### Convolutional Neural Networks
| Model | Reference | Year |
|---|---|---|
| AlexNet | ImageNet Classification with Deep Convolutional Neural Networks | 2012 |
| VGG | Very Deep Convolutional Networks for Large-Scale Image Recognition | 2014 |
| GoogLeNet | Going Deeper with Convolutions | 2015 |
| ResNet | Deep Residual Learning for Image Recognition | 2016 |
| WideResNet | Wide Residual Networks | 2016 |
| SqueezeNet | SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size | 2016 |
| DenseNet | Densely Connected Convolutional Networks | 2017 |
| ResNeXt | Aggregated Residual Transformations for Deep Neural Networks | 2017 |
| SEResNeXt | Squeeze-and-Excitation Networks | 2018 |
| MobileNetV2 | MobileNetV2: Inverted Residuals and Linear Bottlenecks | 2018 |
| ShuffleNetV2 | ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design | 2018 |
| MnasNet | MnasNet: Platform-Aware Neural Architecture Search for Mobile | 2019 |
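Most of these networks are assembled from small repeated blocks; for instance, a minimal sketch of a ResNet-style basic block (identity shortcut only, no projection or downsampling), not the repo's exact implementation:

```python
import tensorflow as tf
from tensorflow.keras import layers

def basic_block(x, filters):
    """Two 3x3 convs with batch norm, plus an identity shortcut (ResNet, 2016)."""
    shortcut = x
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([x, shortcut])
    return layers.ReLU()(x)

inputs = tf.keras.Input((32, 32, 16))
outputs = basic_block(inputs, 16)
tf.keras.Model(inputs, outputs).summary()
```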
### Sequence Models
| Model | Reference | Year |
|---|---|---|
| Transformer | Attention Is All You Need | 2017 |
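The building block of the Transformer is scaled dot-product attention; a minimal single-head, unmasked sketch:

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v):
    """softmax(Q K^T / sqrt(d_k)) V  (Attention Is All You Need)."""
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)
    weights = tf.nn.softmax(scores, axis=-1)
    return tf.matmul(weights, v)

q = tf.random.normal((2, 5, 64))   # (batch, query_len, d_k)
k = tf.random.normal((2, 7, 64))   # (batch, key_len, d_k)
v = tf.random.normal((2, 7, 64))
print(scaled_dot_product_attention(q, k, v).shape)  # (2, 5, 64)
```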
### Graph Neural Networks
| Model | Reference | Year |
|---|---|---|
| MPNN | Neural Message Passing for Quantum Chemistry | 2017 |
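A rough sketch of one dense message-passing step in the MPNN spirit (neighbour aggregation via an adjacency matmul followed by a learned update); the paper, and likely the repo, use edge networks and a GRU update, which are omitted here:

```python
import tensorflow as tf

class MessagePassingStep(tf.keras.layers.Layer):
    """h_v <- update(h_v, sum over neighbours w of message(h_w))."""
    def __init__(self, units):
        super().__init__()
        self.message = tf.keras.layers.Dense(units, activation="relu")
        self.update = tf.keras.layers.Dense(units, activation="relu")

    def call(self, node_states, adjacency):
        messages = tf.matmul(adjacency, self.message(node_states))  # aggregate neighbours
        return self.update(tf.concat([node_states, messages], axis=-1))

nodes = tf.random.normal((1, 6, 8))   # (batch, num_nodes, features)
adj = tf.eye(6, batch_shape=[1])      # toy adjacency (self-loops only)
print(MessagePassingStep(16)(nodes, adj).shape)  # (1, 6, 16)
```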
### Factorization Machines

See the benchmark results for how to use the benchmark runner to train these models on the Criteo dataset.
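For context, the pairwise-interaction term shared by most factorization-machine variants, computed with the usual O(k·n) identity; this is a sketch, not the benchmark runner's code:

```python
import tensorflow as tf

def fm_pairwise_interactions(embeddings):
    """0.5 * ((sum_i v_i)^2 - sum_i v_i^2), summed over the factor dimension.

    embeddings: (batch, num_fields, k) latent vectors of the active features.
    """
    summed_square = tf.square(tf.reduce_sum(embeddings, axis=1))     # (batch, k)
    squared_sum = tf.reduce_sum(tf.square(embeddings), axis=1)       # (batch, k)
    return 0.5 * tf.reduce_sum(summed_square - squared_sum, axis=1)  # (batch,)

emb = tf.random.normal((4, 10, 8))   # e.g. 10 categorical fields, k = 8
print(fm_pairwise_interactions(emb).shape)  # (4,)
```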
### Bag of Tricks

- Learning rate finder - Train the model for a small number of iterations with increasing learning rates, then plot the loss against the learning rate to identify the sweet spot. Reference
- Monte Carlo Dropout - A TTA-style procedure that averages predictions from the model with dropout kept active at inference time (a sketch follows this list). Reference
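A minimal sketch of the Monte Carlo Dropout procedure above: keep dropout active at inference by calling the model with `training=True` and average several stochastic forward passes.

```python
import tensorflow as tf

def mc_dropout_predict(model, x, num_samples=20):
    """Average predictions over stochastic forward passes with dropout on."""
    preds = tf.stack([model(x, training=True) for _ in range(num_samples)], axis=0)
    return tf.reduce_mean(preds, axis=0)

# toy model with dropout, just to make the sketch runnable
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1),
])
x = tf.random.normal((4, 8))
print(mc_dropout_predict(model, x).shape)  # (4, 1)
```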