feature_engine

Information Value for nominal variables

Open SurajitTest opened this issue 3 years ago • 8 comments

Presently, there are no Python packages that calculate Information Value (IV) using WoE for nominal/categorical variables. Since the WoE encoder is already available in Feature-engine, I am requesting a feature to obtain the information value for nominal variables.

SurajitTest avatar Jun 26 '21 07:06 SurajitTest

could you add a few links with information about WoE and IV?

solegalli avatar Jun 26 '21 15:06 solegalli

https://www.listendata.com/2015/03/weight-of-evidence-woe-and-information.html

https://www.listendata.com/2019/08/WOE-IV-Continuous-Dependent.html

http://ucanalytics.com/blogs/information-value-and-weight-of-evidencebanking-case/

SurajitTest avatar Jun 27 '21 10:06 SurajitTest

Hi Sole, I am interested in working on this, but I don't have much experience with open source contribution. I recently watched all your ML courses on Udemy and really liked them, which is why I am inclined to take a shot at this. Can you guide me a little bit here?

saurabhgoel1985 avatar Aug 12 '22 11:08 saurabhgoel1985

Welcome @saurabhgoel1985

Absolutely! I look forward to working with you.

First of all, it's been a while since I read those articles on IV, so I guess the first thing would be to go over them and maybe add a few bullet points below on what the new transformer should do.

Off the top of my head, the IV is calculated from the WoE, and with the IV we can select features.
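For reference, here is a minimal sketch of how the IV can be derived from the WoE for a single nominal variable and a binary target. It assumes the convention WoE = ln(share of positives in the category / share of negatives in the category); some references use the inverse ratio, which flips the sign of the WoE but leaves the IV unchanged. The function name `woe_and_iv` and the toy data are hypothetical and only serve as illustration; this is not Feature-engine code.

```python
import math
from collections import Counter


def woe_and_iv(categories, target):
    """Return ({category: WoE}, IV) for a nominal variable and a 0/1 target.

    Hypothetical helper for illustration only.
    """
    # Count positives (y == 1) and negatives (y == 0) per category.
    pos = Counter(c for c, y in zip(categories, target) if y == 1)
    neg = Counter(c for c, y in zip(categories, target) if y == 0)
    total_pos, total_neg = sum(pos.values()), sum(neg.values())

    woe, iv = {}, 0.0
    for cat in sorted(set(categories)):
        # Share of all positives / all negatives that fall in this category.
        p_pos = pos[cat] / total_pos
        p_neg = neg[cat] / total_neg
        if p_pos == 0 or p_neg == 0:
            # The logarithm is undefined when a category lacks one of the classes.
            raise ValueError(f"category {cat!r} has no observations in one class")
        woe[cat] = math.log(p_pos / p_neg)
        # IV sums, over categories, the difference in shares weighted by the WoE.
        iv += (p_pos - p_neg) * woe[cat]
    return woe, iv


colour = ["a", "a", "a", "a", "b", "b", "b", "b"]
y = [1, 1, 1, 0, 1, 0, 0, 0]
woe, iv = woe_and_iv(colour, y)
print(round(iv, 4))  # 1.0986, i.e. ln(3)
```

A feature with a higher IV separates the two classes more strongly, so ranking or thresholding features by IV is what would turn this into a selection step.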

The new transformer should go in a new Python script with a meaningful name, inside the folder feature_engine/selection.

I suggest you take a look at the WoEEncoder class that lives in the feature_engine/encoding folder, and also at another selection class from the selection folder, for example SelectByShuffling.

The best approach would be to copy one of those classes into the new script and edit the content as needed. Because the new class needs the WoE to calculate the IV, it might be a good idea for it to inherit from WoEEncoder, but I am not sure. Feel free to explore if there is a better solution.
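To make the idea concrete, here is a rough sketch of the fit/transform shape such a selector could take. It is written in plain Python on a dict of columns so that it runs without any dependencies; it does not inherit from WoEEncoder or use any Feature-engine base class. The class name `IVSelectorSketch`, the `threshold` parameter with its default of 0.2, and the attribute names are all assumptions made for illustration, not the design of the eventual transformer.

```python
import math
from collections import Counter


def information_value(values, target):
    """IV of one nominal variable against a 0/1 target (hypothetical helper)."""
    pos = Counter(v for v, y in zip(values, target) if y == 1)
    neg = Counter(v for v, y in zip(values, target) if y == 0)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    iv = 0.0
    for cat in sorted(set(values)):
        p_pos, p_neg = pos[cat] / n_pos, neg[cat] / n_neg
        if p_pos == 0 or p_neg == 0:
            raise ValueError(f"category {cat!r} has no observations in one class")
        # (share of positives - share of negatives) * WoE of the category
        iv += (p_pos - p_neg) * math.log(p_pos / p_neg)
    return iv


class IVSelectorSketch:
    """Hypothetical selector: drops variables whose IV is below `threshold`."""

    def __init__(self, threshold=0.2):  # the default value is an assumption
        self.threshold = threshold

    def fit(self, X, y):
        # X is a dict mapping variable name -> list of category labels.
        self.information_values_ = {
            name: information_value(col, y) for name, col in X.items()
        }
        self.features_to_drop_ = [
            name
            for name, iv in self.information_values_.items()
            if iv < self.threshold
        ]
        return self

    def transform(self, X):
        # Keep only the variables that were not flagged during fit.
        return {n: c for n, c in X.items() if n not in self.features_to_drop_}


X = {
    "colour": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "shape": ["x", "x", "y", "x", "y", "x", "y", "y"],
}
y = [1, 1, 1, 0, 1, 0, 0, 0]
selector = IVSelectorSketch(threshold=0.2).fit(X, y)
print(selector.features_to_drop_)  # ['shape']
```

In this toy data, "shape" has the same share of positives and negatives in every category, so its IV is 0 and it is dropped, while "colour" is kept. Whether the real class should reuse WoEEncoder through inheritance or call it internally remains the open design question.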

Have fun! And let me know if you need anything!

solegalli avatar Aug 13 '22 11:08 solegalli

Correction: the transformer is for feature selection, so I edited the former comment.

solegalli avatar Aug 13 '22 11:08 solegalli

Hey everyone, I apologize, I forgot to write that PR #488 closes this issue. The PR is mentioned above.

I’ve made some decent progress. I’m on vacation. I will continue to work on the PR when I return.

I believe that @solegalli has provided feedback that I need to incorporate.

Morgan-Sell avatar Aug 14 '22 20:08 Morgan-Sell

Thanks for the update, Morgan. I hope the PR gets merged soon.

saurabhgoel1985 avatar Aug 15 '22 08:08 saurabhgoel1985

Sorry @Morgan-Sell, I've not reviewed that PR yet. You make so many PRs that I can't keep up :p

I'll review the 2 remaining PRs on Wednesday.

Cheers

solegalli avatar Aug 15 '22 13:08 solegalli