[FEA] add sklearn's "out-of-bag" and "feature importance" scores to cuML's Random Forest

Open BenWynne-Morris opened this issue 3 years ago • 9 comments

I've been impressed with the speed at which I can train a cuML random forest, which I've been able to get working with WSL2.

However, I've noticed that a couple of fairly standard random forest features appear to be missing:

  1. out-of-bag scores (https://scikit-learn.org/stable/auto_examples/ensemble/plot_ensemble_oob.html)
  2. feature importance scores (https://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_importances.html)

I think the latter would be especially useful to give you a "rapid" assessment of feature importance as a precursor to exploring other candidate models.
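
For reference, the two scikit-learn features requested above can be sketched roughly as follows. This uses scikit-learn's own `RandomForestClassifier` on synthetic data. The dataset shape and parameter values are illustrative assumptions, and nothing here is cuML API.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data for illustration only: 10 features, 5 of them informative.
X, y = make_classification(
    n_samples=500,
    n_features=10,
    n_informative=5,
    n_redundant=0,
    random_state=0,
)

# oob_score=True scores each training sample using only the trees whose
# bootstrap sample did not include it, so bootstrap must stay enabled.
clf = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,
    oob_score=True,
    random_state=0,
)
clf.fit(X, y)

# 1. Out-of-bag score: an accuracy estimate that needs no held-out set.
oob_accuracy = clf.oob_score_

# 2. Impurity-based feature importances, normalized to sum to 1.
importances = clf.feature_importances_

print(f"OOB accuracy: {oob_accuracy:.3f}")
for idx in importances.argsort()[::-1]:
    print(f"feature {idx}: importance {importances[idx]:.3f}")
```

Both attributes are populated by `fit`, so the quick assessment described above costs no extra training pass in scikit-learn.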

BenWynne-Morris avatar Jan 11 '21 20:01 BenWynne-Morris

@BenWynne-Morris can you rename the feature request to something like [FEA] add sklearn's "out-of-bag" and "feature importance" scores to cuML's Random Forest? Making the title more descriptive will help us with tracking and whatnot.

taureandyernv avatar Jan 11 '21 21:01 taureandyernv

> @BenWynne-Morris can you rename the feature request to something like [FEA] add sklearn's "out-of-bag" and "feature importance" scores to cuML's Random Forest? Making the title more descriptive will help us with tracking and whatnot.

No problem, thanks Taurean

BenWynne-Morris avatar Jan 11 '21 22:01 BenWynne-Morris

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

github-actions[bot] avatar Feb 16 '21 20:02 github-actions[bot]

These features would still be useful.

BenWynne-Morris avatar Mar 19 '21 23:03 BenWynne-Morris

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

github-actions[bot] avatar May 13 '21 04:05 github-actions[bot]

I second this feature request. feature_importances_ is a very basic and commonly used attribute of the sklearn RandomForestClassifier class, and it is best implemented at the library level because users can't easily add it after the fact.

maltekuehl avatar Aug 02 '21 10:08 maltekuehl

This is still an important issue.

yankikalfa avatar Apr 13 '22 04:04 yankikalfa

It is an important issue worth a look.

Wulin-Tan avatar Aug 28 '22 16:08 Wulin-Tan

Is this still under evaluation?

indalaterre avatar Dec 31 '23 11:12 indalaterre