pega-datascientist-tools
Calculation of Feature Importance incorrect
pdstools version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pdstools.
Issue description
The Feature Importance for NB models calculated by pdstools isn't the same as in the platform:
- The R version has a subtle issue: it does not use the right Laplace smoothing (1 rather than 1/#bins).
- The Python version seems totally off: it does not calculate the diff from the mean and does not scale.
- The platform suffers from the same issues as the Python implementation; this is tracked under BUG-880410.
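To make the three ingredients named above concrete (Laplace smoothing of 1/#bins, the difference from the mean, and scaling), here is a minimal sketch of one possible reading. It is not the pdstools, R, or platform implementation. The function names, the response-weighted averaging, and the scaling to a maximum of 100 are all assumptions made for illustration, and the Excel analysis may define these steps differently.

```python
import math


def bin_log_odds(positives, negatives):
    """Per-bin log odds with Laplace smoothing of 1/#bins (assumed form)."""
    n_bins = len(positives)
    smoothing = 1.0 / n_bins  # the issue contrasts this with a smoothing of 1
    total_pos = sum(positives)
    total_neg = sum(negatives)
    return [
        (math.log(p + smoothing) - math.log(total_pos + 1.0))
        - (math.log(n + smoothing) - math.log(total_neg + 1.0))
        for p, n in zip(positives, negatives)
    ]


def feature_importance(positives, negatives):
    """Response-weighted mean absolute deviation of the bin log odds
    from their response-weighted mean (assumed definition)."""
    log_odds = bin_log_odds(positives, negatives)
    responses = [p + n for p, n in zip(positives, negatives)]
    total = sum(responses)
    if total == 0:
        return 0.0
    mean_log_odds = sum(lo * r for lo, r in zip(log_odds, responses)) / total
    return sum(abs(lo - mean_log_odds) * r for lo, r in zip(log_odds, responses)) / total


def scale_importances(raw):
    """Scale so the strongest predictor gets 100 (assumed convention)."""
    top = max(raw.values())
    return {name: (100.0 * value / top if top > 0 else 0.0) for name, value in raw.items()}


# Hypothetical bin counts for two predictors.
raw = {
    "Flat": feature_importance([10, 10], [90, 90]),
    "Informative": feature_importance([18, 2], [82, 98]),
}
scaled = scale_importances(raw)
```

With these made-up counts, the predictor whose bins all have the same positive rate gets an importance of zero, and the informative one is scaled to the top of the range.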
Reproducible example
See Excel sheet for analysis
Expected behavior
All versions should give the exact same results
Installed versions
n/a, issues have been around for a while
@operdeck can we squeeze in a fix for #260 for this? Or do we not have a fix yet?
Let's park this one for a little while; I am not certain about the solution. I explored a lot of things, then found that many of the variations are (very) strongly correlated, even with univariate AUC, so I made this much lower priority for myself. I will pick it up after the v4 release: still valid, but not urgent.