The feedback feature should be restored (by the way, it was used in LeCun's NIPS'11 optimization challenge).
The constant η0 is determined by performing preliminary experiments on a data subsample (see http://leon.bottou.org/projects/sgd). We could also have `asgd.tune_...()` methods to "tune" speed and accuracy (here step_size0 would be...
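As a hedged sketch of what such a tuning helper could look like (the name `tune_sgd_step_size0`, the `clf_factory` callable, and the scikit-learn-style `fit()`/`score()` interface are all assumptions, not the repo's actual API):

```python
import numpy as np

def tune_sgd_step_size0(clf_factory, X, y,
                        candidates=(1e-4, 1e-3, 1e-2, 1e-1, 1.0),
                        subsample_size=1000, seed=0):
    """Pick sgd_step_size0 by short preliminary fits on a data subsample."""
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(X))[:subsample_size]
    Xs, ys = X[idx], y[idx]
    best_step, best_score = None, -np.inf
    for step in candidates:
        clf = clf_factory(sgd_step_size0=step)
        clf.fit(Xs, ys)
        score = clf.score(Xs, ys)  # accuracy on the subsample
        if score > best_score:
            best_step, best_score = step, score
    return best_step
```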
See the sparsity trick from Bottou.
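A minimal sketch of that trick, assuming hinge loss with L2 regularization (function name and signature are illustrative): the weights are stored as w = scale * v, so the L2 decay touches only the scalar while a sparse example touches only its nonzero coordinates.

```python
import numpy as np

def sparse_sgd_step(v, scale, x_idx, x_val, y, eta, lmbda):
    """One SGD step on a sparse example given as (indices, values).

    The full weight vector is w = scale * v; `scale` starts at 1.0.
    """
    scale *= 1.0 - eta * lmbda                 # L2 decay: O(1), not O(D)
    margin = y * scale * np.dot(v[x_idx], x_val)
    if margin < 1.0:                           # hinge constraint violated
        v[x_idx] += (eta * y / scale) * x_val  # touch only nonzero coords
    return v, scale
```

In practice `scale` shrinks toward zero over time, so implementations occasionally fold it back into v to keep the division well-conditioned.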
The learning rate has the form η_t = η0 / (1 + λ η0 t)^0.75, where λ is the regularization constant. See: http://leon.bottou.org/projects/sgd
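Written as code, the schedule is simply:

```python
def sgd_step_size(t, sgd_step_size0, l2_regularization):
    """eta_t = eta0 / (1 + lambda * eta0 * t) ** 0.75 (Bottou's schedule)."""
    return sgd_step_size0 / (
        1.0 + l2_regularization * sgd_step_size0 * t) ** 0.75
```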
"sphere" the data and merge in the weights
The idea is to boost performance by "disabling" the averaging until it becomes useful. Start with exp_moving_asgd_step_size=1e-2?
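One possible reading of this note, as a sketch: replace ASGD's plain running average with an exponential moving average whose fixed rate makes the averaged weights track the fast-moving SGD weights early on (averaging effectively disabled) and smooth them out once the iterates settle. The parameter name comes from the note above; the update rule itself is an assumption.

```python
def update_averaged_weights(asgd_weights, sgd_weights,
                            exp_moving_asgd_step_size=1e-2):
    """asgd_w <- (1 - rate) * asgd_w + rate * sgd_w"""
    rate = exp_moving_asgd_step_size
    return (1.0 - rate) * asgd_weights + rate * sgd_weights
```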
Multiple (sgd_step_size0, l2_regularization) pairs could be given, and the `*fit()` methods could use BLAS Level-3 operations when appropriate to allow for more data re-use and speed up the computation. This is confusing...
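A sketch of the data-reuse idea (not the repo's implementation): stacking the K weight vectors into a (K, D) matrix turns the per-example dot products for all hyperparameter settings into a single GEMM per mini-batch.

```python
import numpy as np

def batched_margins(W, b, X, y):
    """Margins y_i * (w_k . x_i + b_k) for all K settings at once.

    W: (K, D) stacked weights, b: (K,) biases,
    X: (B, D) mini-batch, y: (B,) labels in {-1, +1}.
    Returns a (B, K) margin matrix; np.dot(X, W.T) is one BLAS-3 call.
    """
    return y[:, None] * (np.dot(X, W.T) + b[None, :])
```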
To decrease communication and speed up convergence, we should have an option (default=True) to only update the weights when the margin constraint has been violated, e.g.: Line #66 should move up (to...
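A sketch of what that option could look like for a single hinge-loss SGD step (the flag name `update_on_violation_only` is hypothetical). Note that skipping the step also defers the L2 decay, which slightly changes the iterates:

```python
import numpy as np

def sgd_step(weights, bias, x, y, eta, lmbda, update_on_violation_only=True):
    margin = y * (np.dot(weights, x) + bias)
    if update_on_violation_only and margin >= 1.0:
        return weights, bias                       # constraint satisfied: skip
    weights = (1.0 - eta * lmbda) * weights + eta * y * x
    bias = bias + eta * y
    return weights, bias
```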
It would be useful to have the possibility of using mini-batches to get better estimates of the gradients (see https://github.com/npinto/asgd/blob/master/asgd/naive_asgd.py#L60). Since we'll be using BLAS, etc., this parameter could possibly...
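A sketch of the idea (this is not the code behind the link above): averaging the hinge-loss sub-gradient over a small batch lowers its variance and maps the inner products onto BLAS calls.

```python
import numpy as np

def minibatch_hinge_gradient(weights, bias, X, y):
    """Average sub-gradient of the hinge loss over a (B, D) mini-batch."""
    margins = y * (np.dot(X, weights) + bias)   # (B,) inner products via BLAS
    coef = (y * (margins < 1.0)) / len(y)       # zero where margin satisfied
    grad_w = -np.dot(coef, X)                   # average over violated examples
    grad_b = -coef.sum()
    return grad_w, grad_b
```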