Monotonic-WOE-Binning-Algorithm

Open Leo-Lee15 opened this issue 6 years ago • 22 comments

Hello,

I just discovered a GitHub repo, jstephenj14/Monotonic-WOE-Binning-Algorithm, which provides a Python implementation of a variable binning algorithm that optimizes information value (IV) monotonicity and representativeness.

I think it would be great to include this algorithm in your fantastic scorecard package. Since the author provides a Python version, I wonder whether it could be incorporated into your scorecard R package.

Thanks!
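
For readers unfamiliar with the idea, here is a minimal sketch of one way to force a monotonic bad rate: pool adjacent bins whenever they break the trend (a pool-adjacent-violators style pass). It is not the algorithm from the linked repo. The function names and the counts are made up for illustration, and it assumes every bin is non-empty.

```python
def bad_rate(bin_counts):
    """Bad rate of a single (good, bad) bin; assumes the bin is non-empty."""
    good, bad = bin_counts
    return bad / (good + bad)


def merge_to_monotonic(bins):
    """Merge adjacent bins until the bad rate is non-decreasing.

    `bins` is a list of (good_count, bad_count) tuples ordered by the
    binned variable. Hypothetical helper, for illustration only.
    """
    merged = []
    for good, bad in bins:
        merged.append([good, bad])
        # Pool the last two bins while the newest one breaks the trend.
        while len(merged) > 1 and bad_rate(merged[-1]) < bad_rate(merged[-2]):
            g, b = merged.pop()
            merged[-1][0] += g
            merged[-1][1] += b
    return [tuple(m) for m in merged]


# Made-up counts: the third bin dips below the second one.
example_bins = [(90, 10), (80, 20), (85, 15), (60, 40)]
monotonic_bins = merge_to_monotonic(example_bins)
print(monotonic_bins)
```

A decreasing trend can be handled by reversing the bin order before and after the call. Pooling trades granularity for monotonicity, so the resulting IV is usually lower than that of the unmerged bins.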

Leo-Lee15 avatar Sep 15 '18 14:09 Leo-Lee15

Thank you for your suggestion. I will read the repo and the referenced article. If it is reasonable, I will add it into the package. This might take some time.

In my experience, some variables are not monotonic after WOE binning. For example, the default rate by hour of the day always peaks at midnight and in the afternoon.
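
To make the point concrete, here is a small sketch that computes WOE and IV for a handful of time-of-day bins and checks whether the WOE sequence is monotonic. The counts are invented purely for illustration, the helper names are hypothetical, and WOE is taken as ln(share of bads / share of goods); some references use the opposite sign.

```python
import math


def woe_iv(bins):
    """Return (woe_per_bin, iv) for a list of (good, bad) counts.

    Assumes every bin has at least one good and one bad case.
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    woes, iv = [], 0.0
    for good, bad in bins:
        dist_good = good / total_good
        dist_bad = bad / total_bad
        woe = math.log(dist_bad / dist_good)
        woes.append(woe)
        iv += (dist_bad - dist_good) * woe
    return woes, iv


def is_monotonic(values):
    """True if the sequence never changes direction."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


# Invented (good, bad) counts for night, morning, afternoon, evening.
hour_bins = [(700, 100), (900, 60), (800, 90), (850, 50)]
woes, iv = woe_iv(hour_bins)
print([round(w, 3) for w in woes], round(iv, 4), is_monotonic(woes))
```

With these made-up counts the WOE goes down, up, and down again, so a variable like this would have to lose real signal to be forced into a monotonic shape.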

ShichenXie avatar Sep 16 '18 13:09 ShichenXie

Yes, it is too difficult to get a monotonic result for some variables. But at least this algorithm provides a less troublesome way to achieve the desired results.

Anyway, thanks for your work on this nice package!

Leo-Lee15 avatar Sep 18 '18 16:09 Leo-Lee15

I used 'woebin' to bin the variables in my data and got the error 'You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat'. I compared my variable types with those in your example data, and your data also contains 'object' columns. Why does your data run without error while mine fails? Do you have any suggestions?

monicamn avatar Nov 27 '18 07:11 monicamn

Are you using the Python version of the package? If so, please open an issue in the scorecardpy repo and provide a reproducible example.

ShichenXie avatar Nov 27 '18 07:11 ShichenXie

Thank you for your answer. I am running the code with Python 3.7. The error is as follows:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2961, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 4, in
    positive="bad|1", no_cores=None, print_step=1, method="tree")
  File "C:\ProgramData\Anaconda3\lib\site-packages\scorecardpy-0.1.7.1-py3.7.egg\scorecardpy\woebin.py", line 877, in woebin
    bins = dict(zip(xs, pool.starmap(woebin2, args)))
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 276, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
ValueError: You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat

monicamn avatar Nov 27 '18 08:11 monicamn

Please open a new issue in the scorecardpy project and provide a reproducible example. Otherwise I have no way of knowing what problem you ran into.

ShichenXie avatar Nov 27 '18 09:11 ShichenXie

Hi ShichenXie, in woebin.py the variable binning_tree is not initialized, which sometimes causes an error. Adding "binning_tree = None" fixes the problem.
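
The comment does not say which error appears, so the following is only a guess at the kind of failure being described: a local variable that is assigned in one branch but read unconditionally. The functions below are hypothetical stand-ins and not the actual woebin.py code.

```python
def pick_binning(method):
    # Hypothetical stand-in: binning_tree is only assigned for one method.
    if method == "tree":
        binning_tree = "tree result"
    return binning_tree  # fails when method is anything else


def pick_binning_fixed(method):
    binning_tree = None  # initialised up front, as the comment suggests
    if method == "tree":
        binning_tree = "tree result"
    return binning_tree


error_name = None
try:
    pick_binning("chimerge")
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name, pick_binning_fixed("chimerge"), pick_binning_fixed("tree"))
```

Initialising to None removes the crash, but any caller then has to cope with a None result, so it is worth checking which branch was expected to set the value.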

6yuan789 avatar Dec 20 '18 10:12 6yuan789

I'll take a look at this problem.

ShichenXie avatar Dec 24 '18 01:12 ShichenXie

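Dear ShichenXie, when I run the woebin function an error pops up saying that there is no "data.table" function. The same message still appears after I run library(data.table), and I don't know why. See the screenshot below: (image)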

wgx711 avatar Mar 05 '19 08:03 wgx711

Restart R and try again. If you are on Windows, check whether rtools is installed.

ShichenXie avatar Mar 05 '19 08:03 ShichenXie

May I ask a basic question: what is rtools? I am indeed on 64-bit Windows 10, with 64-bit R and 64-bit RStudio installed; R is version 3.5.2. Following the woebin help file, running bins2_tree = woebin(germancredit, y="creditability",x=c("credit.amount","housing"), method="tree") works correctly, but running bins_germ = woebin(germancredit, y = "creditability") reports that there is no "data.table" function...

wgx711 avatar Mar 05 '19 09:03 wgx711

Have a look at the Download R for Windows page on the CRAN website. It is the fourth item there.

ShichenXie avatar Mar 05 '19 09:03 ShichenXie

Thanks, I'll look into it. For now, though, everything from bins2_tree = woebin(germancredit, y="creditability",x=c("credit.amount","housing"), method="tree") through bins_width = woebin(germancredit, y="creditability", x=numeric_cols, method="width") runs fine. Only bins_germ = woebin(germancredit, y = "creditability") raises that error...

Would you suggest that I uninstall the scorecard package first and then install the latest version from GitHub?

wgx711 avatar Mar 05 '19 11:03 wgx711

On another computer, where R is version 3.4.2, all the functions run fine... I don't know what the reason is.

wgx711 avatar Mar 05 '19 13:03 wgx711

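  1. If you have never installed rtools, that is the cause. Installing it will fix the problem.
  2. If R was upgraded from a version before 3.5 to the current 3.5.2, you need to reinstall all packages.

If neither of these solves it, I'm afraid I have no other ideas.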

ShichenXie avatar Mar 06 '19 01:03 ShichenXie

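To everyone who comes later: please don't ask unrelated questions in this issue. If you have a problem, open a new issue. This issue remains open only because it has not been resolved yet.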

ShichenXie avatar Mar 06 '19 01:03 ShichenXie

This Github Repo by Wensui Liu also has some MonotonicBinning implementations in R.

ddzr avatar Jul 19 '19 19:07 ddzr

I also hope the scorecard package will add monotonic binning as a binning option.

longhua8800w avatar Jun 03 '20 09:06 longhua8800w

If woebin doesn't return monotonic bins, does that compromise the interpretability of the WOE/IV values? Is it up to the user to rebin?

shlid007 avatar Oct 22 '23 18:10 shlid007

When will the monotonic binning feature be added?

Blanket58 avatar Mar 26 '24 09:03 Blanket58

If woebin doesn't return monotonic bins, does that compromise the interpretability of the WOE/IV values? Is it up to the user to rebin?

The woebin_adj function provides an interface to adjust the binning results manually.
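
As a rough illustration of what adjusting bins by hand amounts to, the sketch below recounts goods and bads for break points chosen by the user. It does not reproduce the woebin_adj interface. The rebin helper, the ages, and the labels are all made up for this example.

```python
from bisect import bisect_right


def rebin(values, labels, breaks):
    """Count (good, bad) per bin for user-chosen, sorted break points.

    Bin i holds values v with breaks[i-1] <= v < breaks[i].
    Labels are 1 for bad and 0 for good. Hypothetical helper.
    """
    counts = [[0, 0] for _ in range(len(breaks) + 1)]
    for value, label in zip(values, labels):
        counts[bisect_right(breaks, value)][label] += 1
    return [tuple(c) for c in counts]


# Made-up data: age of the applicant and whether the loan went bad.
ages = [22, 25, 31, 38, 45, 52, 60, 67]
is_bad = [1, 1, 0, 1, 0, 0, 0, 0]

manual_bins = rebin(ages, is_bad, breaks=[30, 50])
print(manual_bins)
```

Once the counts per manual bin are available, the bad rate or WOE can be recomputed and checked for the trend the analyst expects.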

ShichenXie avatar Mar 26 '24 11:03 ShichenXie

When will the monotonic binning feature be added?

I'll look into it again when I get the chance.

ShichenXie avatar Mar 26 '24 11:03 ShichenXie