CloudForest
Optimization for numerical features with few values.
Add a new numeric feature type (detected automatically on data load) that uses a pre-stored list of all of the feature's distinct values instead of sorting on each split.
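A rough sketch of what such a type might look like follows. Everything here is hypothetical and not part of CloudForest's existing API: the `FewValueFeature` type, the `NewFewValueFeature` constructor, the `maxDistinct` cutoff, and the `Thresholds` method are names invented for illustration. It also assumes the column has no missing or NaN values.

```go
package main

import (
	"fmt"
	"sort"
)

// FewValueFeature is a hypothetical feature type (not an existing CloudForest
// type) that stores each distinct value once, in sorted order.
type FewValueFeature struct {
	Distinct []float64 // sorted distinct values, built once at load time
	Codes    []int     // Codes[i] is the index into Distinct for case i
}

// NewFewValueFeature builds the feature, or returns ok=false when the column
// has more than maxDistinct distinct values so that a loader could fall back
// to an ordinary numeric feature. Assumes values contains no NaNs.
func NewFewValueFeature(values []float64, maxDistinct int) (*FewValueFeature, bool) {
	seen := make(map[float64]struct{})
	for _, v := range values {
		seen[v] = struct{}{}
		if len(seen) > maxDistinct {
			return nil, false
		}
	}
	distinct := make([]float64, 0, len(seen))
	for v := range seen {
		distinct = append(distinct, v)
	}
	sort.Float64s(distinct) // the only sort, done once on load

	index := make(map[float64]int, len(distinct))
	for i, v := range distinct {
		index[v] = i
	}
	codes := make([]int, len(values))
	for i, v := range values {
		codes[i] = index[v]
	}
	return &FewValueFeature{Distinct: distinct, Codes: codes}, true
}

// Thresholds returns candidate split points for the given cases: the midpoints
// between adjacent distinct values that actually occur among those cases.
// It makes one pass over the cases and one over Distinct, with no sorting.
func (f *FewValueFeature) Thresholds(cases []int) []float64 {
	present := make([]bool, len(f.Distinct))
	for _, c := range cases {
		present[f.Codes[c]] = true
	}
	var out []float64
	prev := -1
	for i, p := range present {
		if !p {
			continue
		}
		if prev >= 0 {
			out = append(out, (f.Distinct[prev]+f.Distinct[i])/2)
		}
		prev = i
	}
	return out
}

func main() {
	col := []float64{0, 0, 3, 0, 1, 0, 3, 0} // mostly one value
	f, ok := NewFewValueFeature(col, 4)
	if !ok {
		fmt.Println("too many distinct values; use the ordinary numeric type")
		return
	}
	fmt.Println(f.Distinct)
	fmt.Println(f.Thresholds([]int{0, 1, 2, 3, 4, 5, 6, 7}))
	fmt.Println(f.Thresholds([]int{0, 2, 3}))
}
```

With k distinct values and n cases at a node, enumerating candidate splits this way costs O(n + k) rather than the O(n log n) of a per-split sort, which is where any speedup would come from.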
I suspect this will be faster for sparse features, for feature types like hamming scat that have mostly one value, and for ordinal features with few values.
It could also support optimized mode finding for ordinal regression.
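One possible reading of the mode-finding idea, sketched under assumptions: if each case already carries an index into the sorted list of distinct values, the mode of an ordinal target at a node can be found with a single counting pass. The `Mode` function and its parameters are hypothetical, and breaking ties toward the lowest value is an arbitrary choice made here.

```go
package main

import "fmt"

// Mode returns the most frequent value among the selected cases.
// distinct holds the sorted distinct values; codes[i] is the index into
// distinct for case i. Ties go to the lowest value (an arbitrary choice).
// ok is false when there is nothing to count.
func Mode(distinct []float64, codes []int, cases []int) (mode float64, ok bool) {
	if len(distinct) == 0 || len(cases) == 0 {
		return 0, false
	}
	counts := make([]int, len(distinct))
	for _, c := range cases {
		counts[codes[c]]++
	}
	best := 0
	for i, n := range counts {
		if n > counts[best] { // strict > keeps the lowest value on ties
			best = i
		}
	}
	return distinct[best], true
}

func main() {
	distinct := []float64{1, 2, 3}     // ordinal levels
	codes := []int{0, 2, 2, 1, 2, 0}   // per-case index into distinct
	m, ok := Mode(distinct, codes, []int{0, 1, 2, 3})
	fmt.Println(m, ok)
}
```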