probatus
probatus copied to clipboard
ShapRFECV min number of features to keep per feature group
Problem Description There could be a grouping of features e.g. demographic, siocial etc. In that case you might want to end up with a combination of features from each group, at the end of the feature elimination process.
Desired Outcome We can add two parameters:
- feature_grouping - groups to which each feature belongs to
- min_num_features_per_group - minimum number of features in each group at any point in the feature elimination process.
Thanks to this, even if one type of features is not predictive, at least a couple of them will be kept. This way the model can be explained better to the business in terms of different characteristics of the sample.
The more parameters we add to a class, the more intimidating it becomes for a new user to learn it. I prefer a simple knife as a tool. My point:
If a user really wants this, they could write some code that runs ShapRFECV
on the different groups, then combine the best n
features from each group, then run it one more time on the combined featureset, and select your final featureset based on the results.
What I'm getting at is also called the Unix philosophy:
Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new "features".
So, even though I see the added value of the feature, I would advocate against it.
I agree with the fact that this would complicate the API. That is why a way to implement it could be make a class that inherits from ShapRFECV
e.g. GroupedShapRFECV
. This way there would be a very specific use case for using the other class. The main difference between the two would be extra parameters in the init and different way of selecting which features to remove at a given iteration.
What do you think about this approach? Does the use case argument implementation of an extra class just for this puprose?
I had a similar idea of tackling the issue, of just running ShapRFECV separately for each group. The main drawback of that we don't take into consideration the predictive power of relations between features in different groups.
A separate class for this makes sense. However, does the problem of having to keep at least one feature for every group occur often enough to outweigh the costs of having to maintain more code? (I'm trying to push for keeping things as simple as possible ;) )
After offline discussion, I understand there are teams that could benefit from this use-case. Separating it from ShapRFECV
, I think we can manage to keep the most common use-case as simple as possible.