Dustin Tran

Results: 118 comments by Dustin Tran

Inadmissible as in: if we show ai-SGD is better than prox-SVRG and prox-SAG, and that aSGD is better than SAG and SVRG, then it would not make much sense to...

I added some documentation in the [wiki page](../wiki/Stochastic-gradient-methods).

From intuition, I believe this is because our approximation to the upper bound on the minimal eigenvalue of the Fisher information is too noisy, and thus oftentimes not an upper...

It also makes sense to profile the code and see how long running the validity check takes. On May 9, 2015 4:22 PM, "Panos Toulis" wrote: > as long...

Yup, in the same way they have `stan::mcmc::sample`, we should have something such as `sgd::glm_experiment`. We should also not use the `arma` namespace, but instead `::` into it (i.e., qualify names explicitly as `arma::...`).

Look into how `optim()` and `gmm()` do this, since they're quite fast but also allow users to specify a generic function.

- Check MSE with glm; fail if not close enough
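
A check like this could be sketched as follows. This is a minimal illustration, not the project's actual test code: the helper names `mse` and `close_to_reference` are hypothetical, the coefficient vectors are assumed to be plain `std::vector<double>`, and the default tolerance of `1e-2` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mean squared error between two coefficient vectors of equal length.
double mse(const std::vector<double>& a, const std::vector<double>& b) {
  assert(a.size() == b.size() && !a.empty());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double d = a[i] - b[i];
    sum += d * d;
  }
  return sum / static_cast<double>(a.size());
}

// Hypothetical test predicate: true when the SGD estimate is within `tol`
// (in MSE) of the reference fit, e.g. coefficients obtained from glm.
// The default tolerance is an assumption, not a project setting.
bool close_to_reference(const std::vector<double>& sgd_coef,
                        const std::vector<double>& glm_coef,
                        double tol = 1e-2) {
  return mse(sgd_coef, glm_coef) < tol;
}
```

A test would then fail whenever `close_to_reference(sgd_coef, glm_coef)` returns `false`.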

4) needs to be improved by adding new convergence criteria, cf. #54, so that it only stops when less than 1e-2. This way we ensure that, say, MSE(SGD) < 1e-2...

http://arxiv.org/abs/1505.02417 some examples of plot diagnostics we'd like (and also experiments to run)

Store 100 parameter estimates by default, uniformly separated in log space
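
One way to pick which iterations to store is sketched below. It is only an illustration under stated assumptions: the function name `log_spaced_checkpoints` is hypothetical, iterations are assumed to be numbered from 1 to `n_iters`, and duplicates produced by rounding are dropped, so fewer than 100 checkpoints can come back when the run is short.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: 1-based iteration numbers at which to snapshot the
// parameter estimate, spaced uniformly on a log scale over [1, n_iters].
std::vector<std::size_t> log_spaced_checkpoints(std::size_t n_iters,
                                                std::size_t n_store = 100) {
  assert(n_iters >= 1 && n_store >= 1);
  std::vector<std::size_t> out;
  if (n_store == 1) {
    out.push_back(n_iters);
    return out;
  }
  const double log_max = std::log(static_cast<double>(n_iters));
  for (std::size_t k = 0; k < n_store; ++k) {
    // Uniform grid in log space, from log(1) = 0 up to log(n_iters).
    const double frac =
        static_cast<double>(k) / static_cast<double>(n_store - 1);
    std::size_t it =
        static_cast<std::size_t>(std::llround(std::exp(frac * log_max)));
    it = std::min(std::max<std::size_t>(it, 1), n_iters);
    // Rounding makes early grid points collide; keep each iteration once.
    if (out.empty() || it > out.back()) {
      out.push_back(it);
    }
  }
  return out;
}
```

Inside the SGD loop, the current estimate would be copied out whenever the iteration counter matches the next entry of the returned vector. Log spacing concentrates snapshots early in the run, where the estimates change fastest.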