Ryan Bressler
For feature selection: https://sites.google.com/site/houtaodeng/rrf https://fd03118b-a-62cb3a1a-s-sites.googlegroups.com/site/houtaodeng/publications/FSRegularizedTrees.pdf?attachauth=ANoY7cov-DsoVWCKZ8luuwvvztHuEVtf2OhCNd4dMswUaB7yiaAe_GvpxUQitpLc49yA-4FrNYlatRhMKI2p4CsIqITofbk8u8iynCDqTvOmtog23-j3R8Pgak4E1Ie4jN1h3OBVtphWR7pyCXul6v1xeT-sl241JWT3Wylf7K24ta4h2AECL-c0GtwjQZBORg0drHIUuFu0VsRyJ_M-CVa3XcbT6afzy3DDhLyJxSO7cdZVg1mMEnM%3D&attredirects=0
http://www.ncbi.nlm.nih.gov/pubmed/16986543 Would require a PCA/Eigensolver
http://jmlr.org/proceedings/papers/v25/madani12/madani12.pdf "On Using Nearly-Independent Feature Families for High Precision and Confidence", Omid Madani, Manfred Georg, David A. Ross. This could be interesting for classification from different derived genomic features...
Extremely randomized trees. This may require a new feature/target interface, depending on how much the parameters overlap with the existing splitting code. It should be easy to generate...
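A minimal sketch of the split step in extremely randomized trees, for reference when judging the overlap with the existing splitter. It is not tied to our interfaces: `random_split`, `variance`, and the row/target layout are hypothetical names chosen for illustration, and variance reduction is assumed as the regression score. For each of a random subset of features, one threshold is drawn uniformly between the feature's min and max in the node, and the best-scoring candidate is kept.

```python
import random


def variance(ys):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)


def random_split(rows, targets, n_candidates, rng):
    """Pick a split the extra-trees way (hypothetical helper).

    rows: list of equal-length numeric feature lists, one per case.
    Returns (feature_index, threshold, variance_reduction), or None
    if no candidate feature yields two non-empty children.
    """
    n_features = len(rows[0])
    features = rng.sample(range(n_features), min(n_candidates, n_features))
    parent = variance(targets)
    n = len(targets)
    best = None
    for f in features:
        vals = [r[f] for r in rows]
        lo, hi = min(vals), max(vals)
        if lo == hi:
            continue  # constant feature in this node, cannot split
        # One random threshold per feature, no search over cut points.
        t = rng.uniform(lo, hi)
        left = [y for r, y in zip(rows, targets) if r[f] < t]
        right = [y for r, y in zip(rows, targets) if r[f] >= t]
        if not left or not right:
            continue
        children = (len(left) * variance(left) + len(right) * variance(right)) / n
        score = parent - children
        if best is None or score > best[2]:
            best = (f, t, score)
    return best
```

The only difference from an exhaustive splitter is the threshold draw, so the scoring code could in principle be shared.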
There is a great review of techniques for manifold learning from images/text here: http://research.microsoft.com/pubs/158806/CriminisiForests_FoundTrends_2011.pdf
We should be able to get big speed gains in parallel use by reducing the size of stored data and forests.
Density estimating trees should be easy to implement in our framework and could be used for balancing regression problems etc. It would also be interesting to pursue extension to joint...
Roughly balanced bagging has proven effective for unbalanced classification problems. We should develop a version for regression that samples in-bag cases to ensure roughly uniform density across the range...
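One possible way to sample in-bag cases for the regression variant, sketched under loud assumptions: the target range is cut into equal-width bins, a non-empty bin is chosen uniformly, then a case is drawn from it with replacement. The binning scheme and the name `balanced_bag` are hypothetical, not something the roughly balanced bagging literature prescribes.

```python
import random


def balanced_bag(targets, n_samples, n_bins, rng):
    """Draw in-bag case indices with roughly uniform target density.

    Hypothetical scheme: equal-width bins over the target range, a
    uniform choice among non-empty bins, then a uniform choice of a
    case inside the chosen bin (sampling with replacement).
    """
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all targets equal: everything lands in bin 0
    bins = {}
    for i, y in enumerate(targets):
        b = min(int((y - lo) / width), n_bins - 1)  # clamp the max value
        bins.setdefault(b, []).append(i)
    occupied = list(bins.values())
    return [rng.choice(rng.choice(occupied)) for _ in range(n_samples)]
```

With 90 cases near the bottom of the range and 10 near the top, two bins would give each group about half of the bag, instead of the 90/10 split of a plain bootstrap.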
If evaloob proves to boost performance, we should write an optimized function that splits from the coded split and also includes the corrections for missing values, etc.