
Implements sample_weight and optional permutation and SHAP importance, categorical features, boxplot

Open ThomasBury opened this issue 3 years ago • 6 comments

Hi,

It took me a while, but I finally found the time to continue the discussion from https://github.com/scikit-learn-contrib/boruta_py/pull/77

Meaning:

  • No new dependencies are introduced: an import check is performed only if the user wants to use SHAP or the matplotlib boxplot
  • Permutation importance (sklearn) is also implemented, but it is optional and easy to switch off
  • Categorical features, if any, are encoded (optional)
  • sample_weight can now be passed to the fit method
  • A notebook illustrates the changes and compares the original Boruta_py with the new features
  • A note is added to the README
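
As a rough illustration of the import check, permutation importance, and sample_weight points above (a minimal sketch, not the code of this PR), the snippet below guards an optional dependency and passes the same weights to both fit and scikit-learn's permutation_importance. The helper name optional_import, the synthetic data, and the random forest settings are assumptions made only for this example.

```python
import importlib

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance


def optional_import(name):
    """Hypothetical helper: return the module if installed, else None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Synthetic data and arbitrary per-sample weights (illustrative only).
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3, random_state=0
)
rng = np.random.default_rng(0)
sample_weight = rng.uniform(0.5, 1.5, size=len(y))

# The weights are used when fitting the estimator...
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y, sample_weight=sample_weight)

# ...and again when scoring the permuted features.
result = permutation_importance(
    model, X, y, sample_weight=sample_weight, n_repeats=5, random_state=0
)
importances = result.importances_mean  # one value per feature

# SHAP is only needed if the user asks for it, so it is imported lazily.
shap = optional_import("shap")
print("SHAP available:", shap is not None)
print("Permutation importances:", np.round(importances, 3))
```

With this pattern, a missing shap install only matters when SHAP importance is actually requested; the permutation path relies on scikit-learn alone.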

ThomasBury, Aug 20 '21 15:08

Hi Thomas,

Thanks for this! I will try to find time in the next few weeks to go through it (it's quite a lot). To start with, however, can we make sure that no .idea and .ipynb_checkpoints files are committed? Thanks!

danielhomola, Aug 31 '21 08:08

Sorry, those were leftovers from the previous ignore configuration; the .idea and checkpoint files are now removed. I hope the notebook will be helpful. Do not hesitate to ask if you have any questions or remarks.

Thanks

ThomasBury, Aug 31 '21 12:08

This seems like a really cool PR! Is there any chance that it will get merged soon?

erikvdp, Nov 22 '21 10:11

Thanks @erikvdp. Meanwhile, you might have a look at https://github.com/ThomasBury/arfs, which implements those features and more (although I still think it'd be best to integrate the Boruta-related changes into the official boruta_py ^^)

ThomasBury, Nov 22 '21 11:11

Any chance this will be merged? I would really like to try out Boruta with SHAP feature importance.

MauritsDescamps, Jan 03 '23 16:01

Hi @MauritsDescamps, I built the ARFS package to provide those features for Boruta (and much more). In the ARFS package, you'll find three different methods for performing all-relevant feature selection. I named the evolution of Boruta "Leshy", and it provides the features of this PR. There are notebooks that explain step by step how to use it and what the differences are.

You can test it by simply running pip install -U arfs; there is a brand-new release (version 1.0.2).

ThomasBury, Jan 09 '23 15:01