Auto-PyTorch

Subsampling vs feature selection?

Open: nabenabe0928 opened this issue 2 years ago • 1 comment

Since we usually assume the manifold hypothesis, or low intrinsic dimensionality, I think it is better to reduce the dataset size by feature selection (reducing the number of columns) rather than by subsampling (reducing the number of rows).
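To make the two options concrete, here is a minimal sketch on toy data. It is only an illustration under stated assumptions: the helper names `subsample_rows` and `select_features_by_variance` are hypothetical, and ranking columns by variance is a simple stand-in for a real feature selection method, not what Auto-PyTorch or autogluon actually does.

```python
import random
from statistics import pvariance


def subsample_rows(X, n_rows, seed=0):
    """Hypothetical helper: keep a random subset of rows (reduction over the row size)."""
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(X)), n_rows))
    return [X[i] for i in keep]


def select_features_by_variance(X, n_features):
    """Hypothetical helper: keep the n_features highest-variance columns
    (reduction over the column size). Variance ranking is an assumed stand-in
    for a proper feature selection method."""
    n_cols = len(X[0])
    variances = [pvariance([row[j] for row in X]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:n_features])
    return [[row[j] for j in keep] for row in X], keep


# Toy data: 1000 rows, 20 columns. Only the first 4 columns vary meaningfully;
# the remaining 16 are near-constant noise (a crude low-intrinsic-dimension setup).
rng = random.Random(0)
X = [
    [rng.gauss(0, 1) for _ in range(4)] + [rng.gauss(0, 0.01) for _ in range(16)]
    for _ in range(1000)
]

X_rows = subsample_rows(X, n_rows=200)
X_cols, kept = select_features_by_variance(X, n_features=4)

# Both reductions leave 4000 cells, but with different shapes.
print(len(X_rows), len(X_rows[0]))
print(len(X_cols), len(X_cols[0]))
print(kept)
```

Both reductions shrink the data by the same factor here, but subsampling discards 800 rows that still carry signal in the informative columns, while feature selection keeps every row and drops only the near-constant columns.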

What do you think, @ravinkohli?

nabenabe0928, Mar 10 '22 13:03

For reference, autogluon has implemented feature pruning here

ravinkohli, Mar 11 '22 13:03