machineJS
[UNMAINTAINED] Automated machine learning: just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
data-formatter is going to start taking the log of regression values in most cases. When making final predictions, un-log the values. When making stage 0 predictions, modify the output column...
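A minimal sketch of the log / un-log round trip described above. It is not data-formatter's code: the function names are hypothetical, and using `log1p`/`expm1` (so that a target of 0 stays valid) is an assumption, since the note only says "log".

```python
import math


def log_targets(values):
    """Apply log(1 + y) to each regression target (assumes every y > -1)."""
    return [math.log1p(v) for v in values]


def unlog_predictions(log_predictions):
    """Invert log_targets so predictions are back on the original scale."""
    return [math.expm1(p) for p in log_predictions]


# Hypothetical regression targets, used only to show the round trip.
targets = [0.0, 9.0, 99.0]
logged = log_targets(targets)
restored = unlog_predictions(logged)
```

Whatever transform is applied before training has to be inverted with its exact counterpart at prediction time, otherwise final predictions stay on the log scale.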
This is a far-future idea. When different algorithms disagree, that is a difficult case. When we have disagreement, mark it as such. Aggregate all the samples...
If true, we will modify the process to be simpler. The second round of machineJS, run by ensembler, will pick the algorithm with the best score and just use that,...
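The "pick the best score and just use that" step could look roughly like the sketch below. The function name, the algorithm names, and the scores are all made up for illustration, and it assumes a higher score is better; ensembler's real selection logic is not shown in this note.

```python
def pick_best_algorithm(validation_scores):
    """Return the name of the algorithm with the highest validation score.

    validation_scores maps algorithm name -> score (higher is assumed better).
    """
    if not validation_scores:
        raise ValueError("no trained algorithms to choose from")
    return max(validation_scores, key=validation_scores.get)


# Hypothetical scores from a first training round.
scores = {
    "random_forest": 0.81,
    "gradient_boosting": 0.86,
    "logistic_regression": 0.78,
}
best = pick_best_algorithm(scores)
```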
https://github.com/yandex/rep/ If it works as well as it says it does, it could provide a bunch of awesome new algos to include! It also appears to be under active development.
Related to this thread: http://stackoverflow.com/questions/18306416/adaboostclassifier-with-different-base-learners xgboost jumps immediately to mind...
Have it check the number of trained classifiers, find the best score for each classifier, and make sure the best scores beat the benchmarks we've already set.
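The three checks above can be sketched as a small helper. Everything here is assumed for illustration: the `(classifier, score)` result format, the benchmark dictionary, and the convention that a higher score is better. It is not the project's actual test code.

```python
def best_scores_by_classifier(results):
    """Collapse (classifier_name, score) pairs to the best score per classifier."""
    best = {}
    for name, score in results:
        if name not in best or score > best[name]:
            best[name] = score
    return best


def check_against_benchmarks(results, benchmarks, expected_count):
    """Return a list of human-readable failures; an empty list means all checks passed."""
    best = best_scores_by_classifier(results)
    failures = []
    if len(best) != expected_count:
        failures.append(
            "expected %d trained classifiers, found %d" % (expected_count, len(best))
        )
    for name, benchmark in benchmarks.items():
        if name not in best:
            failures.append("%s was never trained" % name)
        elif best[name] <= benchmark:
            failures.append(
                "%s best score %.3f did not beat benchmark %.3f"
                % (name, best[name], benchmark)
            )
    return failures


# Hypothetical training results and benchmarks.
results = [("rf", 0.80), ("rf", 0.84), ("gbm", 0.79)]
benchmarks = {"rf": 0.82, "gbm": 0.80}
failures = check_against_benchmarks(results, benchmarks, expected_count=2)
```

Returning a list of failures rather than raising on the first one makes it easier to see every classifier that regressed in a single test run.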
Right now they're only for regression data.
tiny is just a logistics check: does the process obviously break down anywhere? This lets us quickly check changes for things like syntax errors and such. robust trains against the full datasets,...
Slide 18 says vw, libffm, and nn are order-dependent, so shuffling the data will give you better results: http://www.slideshare.net/odsc/owen-zhangopen-sourcetoolsanddscompetitions1
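A minimal sketch of shuffling a dataset before handing it to an order-dependent learner. The helper name and the toy data are hypothetical; the one thing it is meant to show is that rows and labels must be permuted with the same ordering so they stay aligned.

```python
import random


def shuffle_rows(features, labels, seed=0):
    """Shuffle features and labels with one shared permutation.

    A fixed seed keeps the shuffle reproducible between runs.
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    order = list(range(len(features)))
    random.Random(seed).shuffle(order)
    return [features[i] for i in order], [labels[i] for i in order]


# Toy data: each label equals the first feature of its row, so alignment is easy to check.
features = [[0, 10], [1, 11], [2, 12], [3, 13]]
labels = [0, 1, 2, 3]
shuffled_features, shuffled_labels = shuffle_rows(features, labels, seed=42)
```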
Have it make a rough guess to shoot for 8 hours of training time, but make that training-time variable super obvious so people can tweak it themselves. Obviously, this...
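One way to make the training-time target "super obvious" is a single named constant at the top of the module, with the search size derived from it. This is only a sketch under assumptions: the constant name, the helper, and the idea of estimating from a measured seconds-per-fit figure are not taken from the project.

```python
# The one knob people are expected to tweak: total training time to aim for.
TARGET_TRAINING_HOURS = 8.0


def estimate_search_iterations(seconds_per_fit, target_hours=TARGET_TRAINING_HOURS):
    """Roughly guess how many model fits fit inside the time budget.

    seconds_per_fit is a measured or estimated cost of training one model.
    Always returns at least 1 so training still happens on slow datasets.
    """
    if seconds_per_fit <= 0:
        raise ValueError("seconds_per_fit must be positive")
    budget_seconds = target_hours * 3600
    return max(1, int(budget_seconds // seconds_per_fit))


# If one fit takes about a minute, an 8-hour budget allows about 480 fits.
iterations = estimate_search_iterations(60)
```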