cruise
Introduce momentum to DNN
Many papers report that momentum can speed up the convergence of DNN training jobs. We should add momentum support to our DNN codebase.
- (Apparently) the first paper to propose the momentum method: Polyak, "Some methods of speeding up the convergence of iteration methods", 1964.
- A good comparison of the classical momentum method and the Nesterov version: Sutskever et al., "On the importance of initialization and momentum in deep learning", 2013, Section 2 (http://www.cs.toronto.edu/~hinton/absps/momentum.pdf)
- Very short version: http://cs231n.github.io/neural-networks-3/#sgd
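
For reference, here is a minimal sketch of the two update rules compared in the references above, written in plain Python rather than against our codebase. The names `momentum_step`, `grad_fn`, `lr`, and `mu` are hypothetical and only for illustration; the actual integration point in our DNN code would likely look different.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def momentum_step(
    theta: Vector,
    velocity: Vector,
    grad_fn: Callable[[Vector], Vector],
    lr: float = 0.01,
    mu: float = 0.9,
    nesterov: bool = False,
) -> Tuple[Vector, Vector]:
    """Apply one momentum update and return (new_theta, new_velocity).

    Classical momentum: v' = mu * v - lr * grad(theta)
    Nesterov momentum:  v' = mu * v - lr * grad(theta + mu * v)
    In both cases:      theta' = theta + v'
    """
    if nesterov:
        # Nesterov evaluates the gradient at the look-ahead point.
        point = [t + mu * v for t, v in zip(theta, velocity)]
    else:
        point = theta
    grad = grad_fn(point)
    new_velocity = [mu * v - lr * g for v, g in zip(velocity, grad)]
    new_theta = [t + v for t, v in zip(theta, new_velocity)]
    return new_theta, new_velocity


def quadratic_grad(x: Vector) -> Vector:
    # Gradient of the toy objective f(x) = sum(x_i ** 2).
    return [2.0 * xi for xi in x]


if __name__ == "__main__":
    theta, velocity = [1.0], [0.0]
    for _ in range(2):
        theta, velocity = momentum_step(
            theta, velocity, quadratic_grad, lr=0.1, mu=0.9
        )
    print(theta, velocity)
```

The only difference between the two variants is where the gradient is evaluated, so a single `nesterov` flag may be enough to support both. The velocity has to be kept per parameter between updates, which is the main piece of extra state this change would introduce.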