Knet.jl
training API improvements
- [ ] SGD learning rate scheduler (sketched below)
- [ ] Global gradient clip (sketched below)
- [ ] Default to overriding `param.opt` rather than keeping it
See also #564
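
For the first two items, a minimal sketch of what the training loop could look like, built on Knet's existing `params`, `@diff`, `grad`, and `update!`. The `sched` function, `globalclip!`, and all hyperparameters are assumptions for illustration, not proposed API:

```julia
using Knet, LinearAlgebra

# Hypothetical schedule (assumption, not proposed API): linear decay
# from lr0 to 0 over nsteps iterations.
sched(t; lr0=0.1, nsteps=10_000) = lr0 * max(0.0, 1 - t / nsteps)

# Hypothetical global clip: rescale all gradients together so their
# joint L2 norm stays below maxnorm, unlike the per-parameter gclip
# field that Knet optimizers already carry.
function globalclip!(grads, maxnorm)
    total = sqrt(sum(norm(g)^2 for g in grads))
    total > maxnorm && foreach(g -> g .*= maxnorm / total, grads)
    return grads
end

# `model`, `loss`, and `data` are assumed to be defined elsewhere.
for p in params(model); p.opt = SGD(lr=0.1); end
for (t, (x, y)) in enumerate(data)
    J = @diff loss(model, x, y)
    gs = [grad(J, p) for p in params(model)]
    globalclip!(gs, 5.0)
    for (p, g) in zip(params(model), gs)
        p.opt.lr = sched(t)     # scheduler runs every iteration
        update!(p, g)
    end
end
```

The design point of the global clip is that it scales by the norm over all parameters jointly, rather than clipping each parameter's gradient independently as the current per-optimizer `gclip` field does.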
In the XNOR-Net paper (https://arxiv.org/pdf/1603.05279.pdf, p. 7), the learning rate update follows the parameter update at every iteration. We can generalize this to any optimizer update (maybe we want to change more than just the LR) and run it every iteration. What inputs would such an update need? The iteration number? Loss values?
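
One way to frame that generalization: a user hook that runs right after the parameter updates each iteration and receives the iteration count and the current loss value, free to mutate any field of any `p.opt`. A minimal sketch; `adjust!` and this `train!` signature are assumptions for discussion, not existing Knet API:

```julia
using Knet

# Hypothetical training loop with a per-iteration optimizer hook.
# `adjust!(model, t, loss)` may mutate any field of any p.opt.
function train!(model, data, adjust!)
    for (t, (x, y)) in enumerate(data)
        J = @diff loss(model, x, y)
        for p in params(model)
            update!(p, grad(J, p))
        end
        adjust!(model, t, value(J))   # hook follows the parameter update
    end
end

# Example hook in the XNOR-Net style: decay the LR after each update.
decay!(model, t, l) = foreach(p -> p.opt.lr *= 0.9999, params(model))

# train!(model, data, decay!)
```

An iteration count alone is enough for fixed schedules like the XNOR-Net decay; passing the loss value as well would also cover loss-driven rules such as reduce-on-plateau.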