Ryan Bressler
The current attempt to normalize the data and run it through an AdaBoost-like function doesn't produce good results. We should implement a simple cutoff-based method as has...
Currently we pass around a BestSplitAllocs object to avoid repeated allocations in each learning goroutine. It might make more sense to call this object something like Learner or LearnerThread...
In cross-validation studies, it would be useful to specify which feature sets and cases to use without having to slice up the fm (note: loading the same massive fm may...
Add a new numeric feature type (and detect such features on data load) that uses a pre-stored list of all the distinct values instead of sorting on each split....
While the core Gini impurity and L2 regression are quite fast, some of the more esoteric ones could be sped up quite a bit.
We should implement the one-sided impurity metric described in this paper: http://arxiv.org/pdf/0811.1645.pdf We may need to invert it so that our code, which minimizes things, will work.
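The inversion could be as simple as negating the score before it reaches the minimizing code. The sketch below shows only that adapter. `maxClassFraction` is a stand-in score invented for this example and is not the metric from the paper, and `AsImpurity` and `argMin` are likewise hypothetical helpers.

```go
package main

import "fmt"

// maxClassFraction is a placeholder score where larger is better. It is
// NOT the one-sided metric from the paper, only something to negate.
func maxClassFraction(counts []int) float64 {
	total, best := 0, 0
	for _, c := range counts {
		total += c
		if c > best {
			best = c
		}
	}
	if total == 0 {
		return 0
	}
	return float64(best) / float64(total)
}

// AsImpurity wraps a score that should be maximized so that code which
// minimizes impurity can use it unchanged.
func AsImpurity(score func([]int) float64) func([]int) float64 {
	return func(counts []int) float64 { return -score(counts) }
}

// argMin returns the index of the candidate with the lowest impurity,
// or -1 if there are no candidates.
func argMin(candidates [][]int, impurity func([]int) float64) int {
	best := -1
	var bestVal float64
	for i, c := range candidates {
		if v := impurity(c); best == -1 || v < bestVal {
			best, bestVal = i, v
		}
	}
	return best
}

func main() {
	// Class counts for three hypothetical candidate child nodes.
	candidates := [][]int{{5, 5}, {9, 1}, {7, 3}}
	fmt.Println(argMin(candidates, AsImpurity(maxClassFraction))) // 1
}
```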