adam-optimizer topic
gradient-descent
A research project on enhancing gradient-based optimization methods
Crowded-Valley---Results
This repository contains the results for the paper: "Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers"
CS231n
PyTorch/TensorFlow solutions for Stanford's CS231n: "Convolutional Neural Networks for Visual Recognition"
RAdam
On the Variance of the Adaptive Learning Rate and Beyond
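The paper's core idea is to rectify the high variance of Adam's adaptive learning rate during the first training steps. Below is a minimal NumPy sketch of the rectified update described in the paper; the function name radam_step and the hyperparameter defaults are illustrative, not taken from any particular implementation.

    import numpy as np

    def radam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        # One RAdam update (Liu et al., 2020). t is the 1-based step count.
        m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
        v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment estimate
        m_hat = m / (1 - beta1 ** t)                  # bias-corrected momentum
        rho_inf = 2.0 / (1.0 - beta2) - 1.0
        rho_t = rho_inf - 2.0 * t * beta2 ** t / (1.0 - beta2 ** t)
        if rho_t > 4.0:
            # Variance of the adaptive lr is tractable: apply the rectification term.
            v_hat = np.sqrt(v / (1 - beta2 ** t))
            r_t = np.sqrt(((rho_t - 4) * (rho_t - 2) * rho_inf)
                          / ((rho_inf - 4) * (rho_inf - 2) * rho_t))
            theta = theta - lr * r_t * m_hat / (v_hat + eps)
        else:
            # Too few samples to estimate the variance: fall back to momentum SGD.
            theta = theta - lr * m_hat
        return theta, m, v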
deepnet
Educational deep learning library in plain NumPy.
CS-F425_Deep-Learning
CS F425 Deep Learning course at BITS Pilani (Goa Campus)
padam-tensorflow
Reproducing the paper "Padam: Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks" for the ICLR 2019 Reproducibility Challenge
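Padam interpolates between SGD with momentum and AMSGrad by raising the second-moment denominator to a partial power p in (0, 1/2]. A hedged NumPy sketch of that update follows; padam_step and the default p=0.125 are illustrative names and values, not the repository's API.

    import numpy as np

    def padam_step(theta, grad, m, v, v_max, lr=0.1, beta1=0.9, beta2=0.999,
                   p=0.125, eps=1e-8):
        # One Padam update: AMSGrad with a partially adaptive exponent p <= 0.5.
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        v_max = np.maximum(v_max, v)                  # AMSGrad max keeps v monotone
        theta = theta - lr * m / (v_max ** p + eps)   # p = 0.5 recovers AMSGrad
        return theta, m, v, v_max

Setting p = 1/2 recovers AMSGrad, while p near 0 approaches SGD with momentum, which is the interpolation the paper uses to close the generalization gap.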
SVHN-CNN
Classifying images from the Google Street View House Numbers (SVHN) dataset with a CNN
AdasOptimizer
ADAS is short for Adaptive Step Size. Unlike optimizers that merely normalize the derivative, it fine-tunes the step size itself, aiming to make step-size scheduling obsolete, achiev...
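The repository implements its own step-size adaptation; as a point of reference only, here is a toy NumPy sketch of the general idea of tuning the step size online from gradient agreement (a delta-bar-delta/Rprop-style rule). This is NOT the ADAS algorithm; adaptive_step_sgd, grad_fn, and the grow/shrink factors are hypothetical.

    import numpy as np

    def adaptive_step_sgd(theta, grad_fn, steps=100, lr0=0.01, grow=1.1, shrink=0.5):
        # Toy gradient descent whose per-parameter step size is tuned online:
        # grow lr where successive gradients agree in sign, shrink where they flip.
        lr = np.full_like(theta, lr0)
        prev_grad = np.zeros_like(theta)
        for _ in range(steps):
            grad = grad_fn(theta)
            agree = grad * prev_grad                   # > 0 where the sign persisted
            lr = np.where(agree > 0, lr * grow,
                          np.where(agree < 0, lr * shrink, lr))
            theta = theta - lr * grad
            prev_grad = grad
        return theta

For example, adaptive_step_sgd(np.array([5.0]), lambda x: 2 * x) drives x toward the minimum of x**2 without any hand-written schedule.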
Hypergradient_variants
Improved Hypergradient optimizers, providing better generalization and faster convergence.
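For background, the baseline such variants build on is hypergradient descent (Baydin et al., 2018), which takes a gradient step on the learning rate itself: for vanilla SGD the hypergradient d(loss)/d(lr) equals -dot(g_t, g_{t-1}). A minimal NumPy sketch of that baseline, with illustrative names sgd_hd, grad_fn, and hyper-learning-rate beta:

    import numpy as np

    def sgd_hd(theta, grad_fn, steps=100, lr=0.01, beta=1e-4):
        # SGD with hypergradient descent on its own learning rate: descending
        # d(loss)/d(lr) = -dot(g_t, g_prev) raises lr while consecutive
        # gradients point the same way and lowers it when they oppose.
        prev_grad = np.zeros_like(theta)
        for _ in range(steps):
            grad = grad_fn(theta)
            lr = lr + beta * np.dot(grad, prev_grad)
            theta = theta - lr * grad
            prev_grad = grad
        return theta, lr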