How to explain a prediction for data with just a few features (out of all features in the training dataset)?

Open williamty opened this issue 1 year ago • 2 comments

I have trained LightGBM models for prediction. I can explain the predictions over all features by filling the missing features in the user input data with NAs. Is there any way to explain a prediction for the original user input data without filling it?

williamty, Oct 22 '23 07:10

You can also run the LIME explainer on a selected subset of columns/features. If your model is a regressor, change the explainer's mode from "classification" to "regression".

```python
from lime.lime_tabular import LimeTabularExplainer

# Select the features you want to run the LIME explainer on, for example:
selected_data = data[['col1', 'col2']]

# Create a LimeTabularExplainer
explainer = LimeTabularExplainer(
    selected_data.values,
    feature_names=selected_data.columns.values,
    mode="classification",
)

# Select the instance to explain: the first row of the selected data, as a 1D array
data_row = selected_data.iloc[0].values

# num_features caps how many features appear in the explanation.
# In classification mode, the prediction function must return class probabilities.
explanation = explainer.explain_instance(
    data_row, lgbm_model.predict, num_features=len(selected_data.columns)
)

# Display the explanation
explanation.show_in_notebook()
```

The method above works when you want to explain predictions using only selected features. To generate predictions, however, your test data (x_test) must have the same features as the training data (x_train) used to fit the model; otherwise the model raises an error about mismatched features.
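To illustrate the feature-matching requirement, here is a minimal sketch of padding a partial user input out to the full training feature set. The feature names, `TRAIN_FEATURES`, and `to_full_row` are hypothetical and not from this thread; the sketch also assumes missing features should be represented as NaN, which LightGBM treats as missing values by default.

```python
import numpy as np

# Hypothetical training feature order; replace with your model's own feature names.
TRAIN_FEATURES = ["col1", "col2", "col3", "col4"]

def to_full_row(user_input, feature_names=TRAIN_FEATURES):
    """Place the provided values in training-column order, with NaN elsewhere."""
    row = np.full(len(feature_names), np.nan)
    for position, name in enumerate(feature_names):
        if name in user_input:
            row[position] = user_input[name]
    return row.reshape(1, -1)  # 2D: one row containing every training feature

# Example: the user supplied only two of the four features.
full_row = to_full_row({"col2": 3.5, "col4": 1.0})
```

The resulting `full_row` has the same width as the training data, so it can be passed to the model's predict call without triggering a feature-count mismatch.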

apoplexi24, Oct 25 '23 14:10

@apoplexi24 Thank you for your kind reply! It worked! By the way, I also changed the prediction function to set the `predict_disable_shape_check` parameter to true:

```python
import numpy as np

def predict_fn(x):
    if len(np.array(x).shape) == 1:
        # Reshape individual data points to 2D
        return ldl.predict(np.array(x).reshape(1, -1), predict_disable_shape_check=True)
    else:
        # Predict for the entire dataset
        return ldl.predict(x, predict_disable_shape_check=True)

def predict_fn_binary(x):
    # Return two columns, P(class 0) and P(class 1), as LIME expects for classification
    return np.column_stack((1 - predict_fn(x), predict_fn(x)))
```
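For readers wondering why the two-column wrapper is needed: in classification mode, LIME expects the prediction function to return one probability column per class, while a binary model's predict call may return only the positive-class probability. The sketch below shows the same idea with a stand-in model so it runs on its own; `positive_class_probability` and `predict_proba_two_columns` are hypothetical names, and the logistic formula is only a placeholder for a real model.

```python
import numpy as np

def positive_class_probability(x):
    """Hypothetical stand-in for a binary model: returns P(class 1) for each row."""
    x = np.atleast_2d(np.asarray(x, dtype=float))  # accept a single 1D row or a 2D batch
    return 1.0 / (1.0 + np.exp(-x.sum(axis=1)))

def predict_proba_two_columns(x):
    """Stack P(class 0) and P(class 1) side by side, one row per input row."""
    p = positive_class_probability(x)  # call the model once and reuse the result
    return np.column_stack((1.0 - p, p))

probs = predict_proba_two_columns(np.array([[0.0, 0.0], [1.0, 2.0]]))
```

Using `np.atleast_2d` covers the single-row case without a separate branch, and computing the probability once avoids calling the model twice per explanation.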

williamty, Oct 26 '23 14:10