Extend FeatureRanking interface for regression tasks

Open sabbatinif opened this issue 4 years ago • 2 comments

It may be useful to have a feature ranking procedure applicable not only to classification tasks (e.g. SignalNoiseRatio and SumSquaresRatio, implementing the FeatureRanking interface), but also to regression tasks. At the moment the FeatureRanking interface only accepts integer target vectors for calculating the feature rank.

sabbatinif avatar Jul 22 '21 14:07 sabbatinif
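
To make the request concrete, here is a minimal sketch of what a regression-capable ranking could look like, assuming it keeps the shape described above but takes real-valued targets. `RegressionFeatureRanking` and `UnivariateFRanking` are hypothetical names used for illustration only, not existing Smile classes. The scoring rule is one simple option among many: the F-statistic of a univariate linear fit, computed from the squared Pearson correlation r² as r² / (1 - r²) * (n - 2).

```java
/** Hypothetical ranking interface for real-valued targets; not part of Smile. */
interface RegressionFeatureRanking {
    /** Returns one score per column of x; a larger score means a more relevant feature. */
    double[] rank(double[][] x, double[] y);
}

/** Scores each feature by the F-statistic of its univariate linear fit to y. */
class UnivariateFRanking implements RegressionFeatureRanking {
    @Override
    public double[] rank(double[][] x, double[] y) {
        int n = x.length;
        if (n != y.length || n < 3) {
            throw new IllegalArgumentException("need matching x and y with at least 3 rows");
        }
        int p = x[0].length;

        double yMean = 0.0;
        for (double v : y) yMean += v;
        yMean /= n;

        double[] scores = new double[p];
        for (int j = 0; j < p; j++) {
            double xMean = 0.0;
            for (int i = 0; i < n; i++) xMean += x[i][j];
            xMean /= n;

            // Accumulate centered sums of squares and cross products.
            double sxy = 0.0, sxx = 0.0, syy = 0.0;
            for (int i = 0; i < n; i++) {
                double dx = x[i][j] - xMean;
                double dy = y[i] - yMean;
                sxy += dx * dy;
                sxx += dx * dx;
                syy += dy * dy;
            }

            // Squared Pearson correlation; a constant feature or target scores 0.
            double r2 = (sxx == 0.0 || syy == 0.0) ? 0.0 : (sxy * sxy) / (sxx * syy);
            // F-statistic with n - 2 degrees of freedom; a perfect fit maps to infinity.
            scores[j] = r2 >= 1.0 ? Double.POSITIVE_INFINITY : r2 / (1.0 - r2) * (n - 2);
        }
        return scores;
    }
}
```

Calling `new UnivariateFRanking().rank(x, y)` with `x` as an n-by-p matrix and `y` as a `double[]` of length n yields p scores. Each feature is scored independently, so redundant features are not penalized.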

What feature selection criteria for regression are of interest?

haifengl avatar Aug 10 '21 17:08 haifengl

I have no strong preference about the criterion. I can suggest something similar to Python scikit-learn's feature_selection.f_regression. It consists of a sequential algorithm that iteratively and greedily selects the most relevant features of a dataset. It starts by training a temporary regressor on a single feature (the one most correlated with the output values) and then repeats the operation, adding one feature at a time and always picking the one that most increases the temporary regressor's predictive performance. At the end of this process, features are ranked by relevance. Any other criterion would be useful to me as well.

sabbatinif avatar Sep 06 '21 13:09 sabbatinif
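
The procedure described in the comment above is greedy forward selection. As far as I know, scikit-learn's `f_regression` itself scores each feature separately with a univariate F-test, while the sequential behaviour is closer to its `SequentialFeatureSelector`. The sketch below implements the greedy variant under explicit assumptions: the temporary regressor is ordinary least squares with an intercept, and "predictive performance" is measured by the reduction in residual sum of squares on the training data (a real implementation would more likely use cross-validation). `ForwardSelectionRanking` is a hypothetical name, not a Smile class.

```java
/** Greedy forward selection for regression; a hypothetical helper, not part of Smile. */
class ForwardSelectionRanking {
    private static final double EPS = 1e-12;

    /** Returns feature indices in the order they were selected (most relevant first). */
    static int[] order(double[][] x, double[] y) {
        int n = x.length;
        if (n != y.length) {
            throw new IllegalArgumentException("x and y must have the same number of rows");
        }
        int p = x[0].length;

        // Centering y and every column accounts for the intercept of the linear model.
        double[] residual = centered(y);
        double[][] cols = new double[p][];
        for (int j = 0; j < p; j++) {
            double[] col = new double[n];
            for (int i = 0; i < n; i++) col[i] = x[i][j];
            cols[j] = centered(col);
        }

        boolean[] used = new boolean[p];
        int[] order = new int[p];
        for (int step = 0; step < p; step++) {
            // Pick the unused feature whose addition removes the most residual sum of squares.
            int best = -1;
            double bestGain = -1.0;
            for (int j = 0; j < p; j++) {
                if (used[j]) continue;
                double norm2 = dot(cols[j], cols[j]);
                double proj = dot(cols[j], residual);
                double gain = norm2 > EPS ? proj * proj / norm2 : 0.0;
                if (gain > bestGain) {
                    bestGain = gain;
                    best = j;
                }
            }
            used[best] = true;
            order[step] = best;

            double[] q = cols[best];
            double qNorm2 = dot(q, q);
            if (qNorm2 <= EPS) continue; // constant or fully redundant column: nothing to remove

            // Remove the chosen direction from the residual and from the remaining columns,
            // so later gains measure only what each feature adds beyond those already selected.
            subtractProjection(residual, q, qNorm2);
            for (int j = 0; j < p; j++) {
                if (!used[j]) subtractProjection(cols[j], q, qNorm2);
            }
        }
        return order;
    }

    private static double[] centered(double[] v) {
        double mean = 0.0;
        for (double a : v) mean += a;
        mean /= v.length;
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) out[i] = v[i] - mean;
        return out;
    }

    private static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    private static void subtractProjection(double[] v, double[] q, double qNorm2) {
        double c = dot(v, q) / qNorm2;
        for (int i = 0; i < v.length; i++) v[i] -= c * q[i];
    }
}
```

`ForwardSelectionRanking.order(x, y)` returns feature indices rather than scores. Mapping the position in that order to a per-feature rank would let it fit behind a `double[]`-returning ranking method. Orthogonalizing the remaining columns after each pick avoids refitting a full regression for every candidate while selecting the same features as refitting would under the training-error criterion.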