
Reliability of LIME continuous variable discretization with a poor explanation score

Open SSMK-wq opened this issue 3 years ago • 0 comments

I am using a random forest classifier for binary classification with 977 records and class proportion of 77:23.

I am using the LIME explainer to explain the predictions made by the model.

However, my LIME explanation score is only 20-40 for 80% of my observations.

But I like that LIME discretizes continuous variables into bins for model explanations. For example, Age is divided into 3 bins: <30, >30 and <=78, and >78.
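To make the binning concrete, here is a minimal sketch of mapping an Age value to one of the three bins using the two cut points from my example (30 and 78). Treating each bin as closed on the right (so 30 falls in the first bin) is my assumption about the boundary handling, and `AGE_CUTS`, `AGE_LABELS`, and `age_bin` are names I made up for illustration, not part of LIME.

```python
from bisect import bisect_left

# Cut points taken from the Age example above.
AGE_CUTS = [30, 78]
# Human-readable labels, assuming right-closed bins (an assumption).
AGE_LABELS = ["Age <= 30", "30 < Age <= 78", "Age > 78"]


def age_bin(age, cuts=AGE_CUTS):
    """Return the 0-based bin index for a value, with right-closed bins."""
    # bisect_left counts how many cut points are strictly below the value,
    # so a value equal to a cut point stays in the lower bin.
    return bisect_left(cuts, age)


ages = [22, 30, 45, 78, 81]  # toy values, not my real data
bins = [age_bin(a) for a in ages]
labels = [AGE_LABELS[b] for b in bins]

for a, lab in zip(ages, labels):
    print(a, "->", lab)
```
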

So, the positive and negative classes fall into different bins: the positive class has only two bins (bins 1 and 2) and the negative class has only bin 3.

So, instead of relying on the LIME feature coefficients (which may not be reliable due to the poor explanation score), I plan to simply count how many times each bin appears under each class and plot those counts as bars. That way I take advantage of the LIME discretization (for each class) but use my own method to show the importance of a feature.
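Roughly, the counting I have in mind looks like the sketch below. It assumes I have already collected one (class, bin label) pair per explained observation; the `records` list here is toy data, and `bin_counts_by_class` and `bin_shares_by_class` are hypothetical helper names, not LIME functions.

```python
from collections import Counter, defaultdict


def bin_counts_by_class(records):
    """Count how often each bin label appears under each class."""
    counts = defaultdict(Counter)
    for cls, bin_label in records:
        counts[cls][bin_label] += 1
    return counts


def bin_shares_by_class(counts):
    """Convert raw counts into per-class proportions."""
    shares = {}
    for cls, counter in counts.items():
        total = sum(counter.values())
        shares[cls] = {b: n / total for b, n in counter.items()}
    return shares


# Toy (class, bin label) pairs, one per explained observation.
records = [
    ("positive", "Age <= 30"),
    ("positive", "30 < Age <= 78"),
    ("positive", "Age <= 30"),
    ("negative", "Age > 78"),
    ("negative", "Age > 78"),
]

counts = bin_counts_by_class(records)
shares = bin_shares_by_class(counts)

for cls in counts:
    print(cls, dict(counts[cls]), shares[cls])
```

Since my classes are imbalanced (77:23), the per-class proportions are probably a fairer basis for the bar plots than the raw counts.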

But do you think a poor LIME explanation score indicates poorly computed bins for continuous variables? Can I rely on the bins LIME computes for a continuous variable even though my explanation score is only 20-40? I really like that LIME computed different bin ranges for each class. This is such an interesting insight.

SSMK-wq, Mar 29 '22 16:03