sgd
Implement other SGD methods
Not sure whether we should include these. While nice for comprehensiveness, if our ai-SGD is ultimately superior to them, then including "inadmissible" estimators as an option would only confuse users.
- [x] standard
- [x] implicit
- [x] Classical momentum
- [x] Nesterov momentum
- [x] averaging
- [ ] SAG
- [ ] SVRG
- [ ] prox-SAG
- [ ] prox-SVRG
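For reference, the five checked variants differ only in their per-observation update. Below is a minimal sketch, assuming a least-squares loss 0.5 * (y - x·theta)^2 and a learning rate decaying as lr0 / sqrt(n). The names `run_sgd`, `make_data`, `lr0`, and `mu` are hypothetical and chosen for illustration; this is not the package's API or implementation.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def grad(theta, x, y):
    # Gradient of 0.5 * (y - x.theta)^2 with respect to theta.
    r = y - dot(x, theta)
    return [-r * xi for xi in x]

def run_sgd(data, dim, method="standard", lr0=0.1, mu=0.5):
    # Hypothetical helper: one pass over (x, y) pairs, one update per pair.
    theta = [0.0] * dim
    v = [0.0] * dim      # velocity, used by the momentum variants
    avg = [0.0] * dim    # running mean of the iterates
    for n, (x, y) in enumerate(data, start=1):
        lr = lr0 / n ** 0.5  # assumed learning-rate schedule
        if method in ("standard", "averaging"):
            g = grad(theta, x, y)
            theta = [t - lr * gi for t, gi in zip(theta, g)]
        elif method == "implicit":
            # Implicit update theta_new = theta - lr * grad(theta_new);
            # for least squares it has this closed form.
            shrink = lr / (1.0 + lr * dot(x, x))
            g = grad(theta, x, y)
            theta = [t - shrink * gi for t, gi in zip(theta, g)]
        elif method == "momentum":
            # Classical momentum: gradient taken at the current iterate.
            g = grad(theta, x, y)
            v = [mu * vi - lr * gi for vi, gi in zip(v, g)]
            theta = [t + vi for t, vi in zip(theta, v)]
        elif method == "nesterov":
            # Nesterov momentum: gradient taken at the look-ahead point.
            look = [t + mu * vi for t, vi in zip(theta, v)]
            g = grad(look, x, y)
            v = [mu * vi - lr * gi for vi, gi in zip(v, g)]
            theta = [t + vi for t, vi in zip(theta, v)]
        else:
            raise ValueError(method)
        avg = [a + (t - a) / n for a, t in zip(avg, theta)]
    # Averaged SGD reports the mean of the iterates instead of the last one.
    return avg if method == "averaging" else theta

def make_data(n, true_theta, seed=0):
    # Synthetic linear-regression data with Gaussian noise (sd 0.1).
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in true_theta]
        y = dot(x, true_theta) + rng.gauss(0, 0.1)
        data.append((x, y))
    return data
```

Calling `run_sgd(make_data(5000, [1.0, -2.0, 0.5]), 3, method="implicit")` runs one pass with the implicit update; the other variants are selected through `method`. SAG and SVRG are not sketched here because they additionally need stored or periodically recomputed gradients.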
By inadmissible, do you mean SAG/SVRG or the other non-averaged variants of SGD?
Inadmissible as in: if we show that ai-SGD is better than prox-SVRG and prox-SAG, and that aSGD is better than SAG and SVRG, then it would not make much sense to include SAG/SVRG, since there would be no advantage to using them.
I think we will not be able to prove that AISGD > SAG/SVRG.
However, I also think we should not include SAG/SVRG. Their implementation is too nuanced and the authors have not made any software public as far as I know. The burden should be on them. :)
I added some documentation in the wiki page.