sgd
Plot diagnostics
Each of these plots must accept multiple sgd objects, so that several fits can be compared on the same axes.
- [x] MSE
- [x] Classification error
- [ ] Evaluation of cost function
Available x-axes for each of the above plots:
- [x] Runtime
- [x] log-Iteration
http://arxiv.org/abs/1505.02417 has some examples of the plot diagnostics we'd like (and also experiments to run).
By default, store 100 parameter estimates, taken at iterations spaced uniformly in log space.
For an idea of what the visuals should look like, I quite like the figures in http://arxiv.org/abs/1206.1106 (not that paper in particular, it's just an arbitrary one I chose), i.e. high-resolution fonts, embedded LaTeX in the legend, etc.
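A minimal sketch of how the log-spaced checkpoints could be chosen, in Python for illustration only (the package itself is not written in Python). The function name `log_spaced_checkpoints` and its arguments are hypothetical. Because iteration numbers are integers, early checkpoints can collide after rounding, so fewer than 100 distinct iterations may come back for short runs.

```python
import math


def log_spaced_checkpoints(n_iters, n_points=100):
    """Return sorted, distinct iteration numbers in [1, n_iters] that are
    spaced approximately uniformly on a log scale."""
    if n_iters < 1 or n_points < 1:
        raise ValueError("n_iters and n_points must be at least 1")
    if n_iters == 1 or n_points == 1:
        return [n_iters]
    log_max = math.log(n_iters)
    points = set()
    for k in range(n_points):
        # k / (n_points - 1) runs from 0 to 1, so the exponent
        # runs from 0 to log(n_iters).
        points.add(int(round(math.exp(log_max * k / (n_points - 1)))))
    # Rounding can map several k to the same iteration; the set drops repeats.
    return sorted(points)


if __name__ == "__main__":
    checkpoints = log_spaced_checkpoints(10**6)
    print(len(checkpoints), checkpoints[:5], checkpoints[-1])
```

Storing estimates only at these iterations keeps memory bounded while still giving evenly spaced points on a log-iteration x-axis.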
(Image: picture of current progress)
Bugs/things to continue working on:
- sgd gives nonsensical prediction results (could be the result of a bad learning rate)
- adagrad takes a disproportionately long time; not sure if this is just because it's O(d)
- sgd and adagrad give nonsensical cost function evaluations
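For reference on the O(d) point above, here is a generic diagonal AdaGrad update sketched in Python. It is not the package's C++ implementation, and the names `adagrad_step`, `accum`, `lr` and `eps` are made up for illustration. Each step touches every one of the d coordinates once, so the per-iteration cost grows linearly with the number of features.

```python
import math


def adagrad_step(theta, grad, accum, lr=0.1, eps=1e-6):
    """Apply one diagonal AdaGrad update in place and return theta.

    theta, grad and accum are lists of length d. accum holds the running
    sum of squared gradients for each coordinate.
    """
    for j in range(len(theta)):  # one pass over all d coordinates
        accum[j] += grad[j] * grad[j]
        # Per-coordinate step size: lr / sqrt(sum of squared gradients).
        theta[j] -= lr * grad[j] / (math.sqrt(accum[j]) + eps)
    return theta


if __name__ == "__main__":
    theta = [0.0, 0.0]
    accum = [0.0, 0.0]
    adagrad_step(theta, [1.0, -2.0], accum)
    print(theta, accum)
```

A plain SGD step with a dense gradient is also linear in d, so in this sketch the extra work per coordinate is one square root and one division. A slowdown much larger than that constant factor would suggest the cause lies elsewhere.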
Progress:
- The nonsensical sgd predictions were definitely just a problem of setting the hyperparameter `alpha` in Xu's learning rate. This still needs to be tweaked for implicit and ai-SGD.
- adagrad's bug, if it exists, lies in the C++ code. The time it takes is definitely proportional to the number of features: Covertype has 54 features, MNIST has 784, and Covertype is much faster, as seen above.
- I still need to find the bug in the cost function evaluations.
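To illustrate how sensitive the schedule is to alpha, here is a small Python sketch. It assumes the learning rate has the form gamma_n = gamma0 * (1 + alpha * gamma0 * n)^(-c) from Xu (2011); whether the package uses exactly this parameterisation is an assumption. The function name `xu_learning_rate` and the default values are placeholders, not the package's defaults.

```python
def xu_learning_rate(n, gamma0=1.0, alpha=1.0, c=0.75):
    """Assumed schedule: gamma0 * (1 + alpha * gamma0 * n) ** (-c).

    n is the iteration number. gamma0, alpha and c are illustrative
    placeholder values only.
    """
    return gamma0 * (1.0 + alpha * gamma0 * n) ** (-c)


if __name__ == "__main__":
    # Compare how quickly the rate decays for a few values of alpha.
    for alpha in (0.001, 0.1, 10.0):
        rates = [xu_learning_rate(n, alpha=alpha) for n in (1, 100, 10000)]
        print(alpha, rates)
```

Under this form, a large alpha shrinks the step size quickly and a small alpha keeps it close to gamma0 for many iterations, so a poor choice in either direction could plausibly produce estimates like the ones reported above.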