sgd
An R package for large-scale estimation with stochastic gradient descent
The current assumption is that the Fisher information gives the variances, but we should also be able to check whether or not SGD is in the asymptotic region
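To illustrate the Fisher-information variance mentioned above, here is a minimal sketch for a one-parameter logistic model with no intercept. It is written in plain Python rather than the package's R/C++ code, and the names `fisher_information` and `fisher_variance` are hypothetical; they are not part of `sgd`. It only shows the plug-in variance and does not attempt the asymptotic-region check.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def fisher_information(theta, xs):
    """Fisher information of theta for y_i ~ Bernoulli(sigmoid(theta * x_i)).

    For this model the information is sum_i x_i^2 * p_i * (1 - p_i).
    """
    total = 0.0
    for x in xs:
        p = sigmoid(theta * x)
        total += x * x * p * (1.0 - p)
    return total


def fisher_variance(theta, xs):
    """Asymptotic variance estimate: the inverse of the Fisher information."""
    return 1.0 / fisher_information(theta, xs)
```

With several parameters the same idea applies with the information matrix and its inverse in place of the scalar reciprocal.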
How to handle feature selection, like `glmnet`? Should allow for something such as elastic net to do L1/L2 regularization.

- [x] GLMs (explicit)
- [x] GLMs (implicit)
- [ ]...
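One common way to combine SGD with an elastic net penalty is a proximal step: take a gradient step on the loss plus the smooth L2 part, then soft-threshold for the L1 part. The sketch below assumes the `glmnet`-style penalty `lam * (alpha * |theta|_1 + (1 - alpha) / 2 * |theta|_2^2)`. It is plain Python for illustration, the function names are hypothetical, and it is not a description of how `sgd` does or will implement regularization.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values within the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_step(theta, grad, lr, lam, alpha):
    """One proximal SGD step under an elastic net penalty.

    theta: current coefficients (list of floats)
    grad:  stochastic gradient of the unpenalized loss at theta
    lr:    learning rate
    lam:   overall penalty strength
    alpha: mixing weight (1.0 = pure L1, 0.0 = pure L2)
    """
    updated = []
    for t, g in zip(theta, grad):
        # Gradient step on the loss plus the differentiable L2 term.
        z = t - lr * (g + lam * (1.0 - alpha) * t)
        # Proximal (soft-thresholding) step for the L1 term.
        updated.append(soft_threshold(z, lr * lam * alpha))
    return updated
```

The soft-thresholding step is what sets small coefficients exactly to zero, which is the feature-selection behavior the question is after.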
I like this approach for avoiding name clashes during implementation. For instance (arbitrarily chosen), see Stan's approach as in the file https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/sample.hpp.
- [ ] `subset`: a subset of data points; can be a parameter in sgd.control
- [ ] `na.action`: how to handle NA values in the data; can be a parameter...
This cannot be efficiently implemented in Rcpp. Rcpp cannot interface with the R function in any way other than calling it in R each time. Hence for now, we are...
Since we're doing scalable computation, we should work with scalable I/O, storage, etc. packages. That is, we'd like to be able to run SGD on data sets with memory larger...
Add unit tests to `tests/` folder. Follow similar structure as https://github.com/hadley/dplyr/tree/master/tests. See http://r-pkgs.had.co.nz/tests.html