RandomerForest
why is RerF better?
if I recall, RerF is about 10% better than RF on about 10% of the datasets?
let's compute, for each dataset:
- n
- p
- p/n
- sum of singular values
- sum of squared singular values
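a minimal sketch of the per-dataset summaries listed above, assuming each dataset is available as an n x p numeric feature matrix. The function name `dataset_summary` and the dictionary keys are hypothetical, and the matrix is used as-is (no centering or scaling), which is an assumption the notes do not settle. Note that the sum of squared singular values equals the squared Frobenius norm of the matrix.

```python
import numpy as np


def dataset_summary(X):
    """Return n, p, p/n, and two singular-value summaries for an (n, p) matrix.

    Hypothetical helper: X is used raw (no centering or scaling), which is
    an assumption, not something the notes specify.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Singular values only; the singular vectors are not needed here.
    s = np.linalg.svd(X, compute_uv=False)
    return {
        "n": n,
        "p": p,
        "p_over_n": p / n,
        "sum_sv": float(s.sum()),            # nuclear norm
        "sum_sq_sv": float((s ** 2).sum()),  # squared Frobenius norm
    }
```

calling `dataset_summary(X)` once per benchmark dataset and stacking the results gives one row per dataset with the five columns.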
let's make a pairs plot (5 x 5 panels), color-coded by whether RerF is much better than RF (e.g., by >7% or so) or not, and see if any pattern stands out?
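a rough sketch of how the color-coded pairs plot could be put together, assuming per-dataset error rates for RF and RerF are already in hand. The names `much_better` and `pairs_plot` are hypothetical, and reading ">7%" as a relative reduction in error (rather than an absolute difference) is an assumption; swap in whichever definition we actually mean. Plotting uses matplotlib, imported only when the plot is drawn.

```python
import numpy as np


def much_better(err_rf, err_rerf, threshold=0.07):
    """Flag datasets where RerF beats RF by more than `threshold`.

    Assumption: the gain is measured relative to RF's error rate.
    """
    err_rf = np.asarray(err_rf, dtype=float)
    err_rerf = np.asarray(err_rerf, dtype=float)
    # Guard against division by zero when RF already has zero error.
    rel_gain = (err_rf - err_rerf) / np.maximum(err_rf, 1e-12)
    return rel_gain > threshold


def pairs_plot(stats, labels, names):
    """Draw a k x k grid of scatter plots (histograms on the diagonal).

    stats: (n_datasets, k) array, labels: boolean array, names: k strings.
    """
    import matplotlib.pyplot as plt

    stats = np.asarray(stats, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    k = stats.shape[1]
    fig, axes = plt.subplots(k, k, figsize=(2.5 * k, 2.5 * k), squeeze=False)
    for i in range(k):
        for j in range(k):
            ax = axes[i, j]
            if i == j:
                # Diagonal: distribution of one statistic, split by label.
                ax.hist(stats[~labels, i], color="gray", alpha=0.6)
                ax.hist(stats[labels, i], color="red", alpha=0.6)
            else:
                ax.scatter(stats[~labels, j], stats[~labels, i], c="gray", s=10)
                ax.scatter(stats[labels, j], stats[labels, i], c="red", s=10)
            if i == k - 1:
                ax.set_xlabel(names[j])
            if j == 0:
                ax.set_ylabel(names[i])
    return fig
```

with the five statistics (n, p, p/n, sum of singular values, sum of squared singular values) as columns of `stats`, this yields the 5 x 5 grid, with red points marking datasets where RerF is much better.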
in the PAMI paper, we really should try to answer these questions:
- which features/subspaces were informative?
- why does RerF outperform RF (in terms of bias and variance)?
- for what properties of the data do we expect RerF to outperform RF?