Explaining LSTM keras with Eli5 library
Hi, I'm trying to use Eli5 for explaining an LSTM keras model for time series prediction. The keras model receives as input an array with shape (nsamples, timesteps, nfeatures).
This is my code:
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance

def baseline_model():
    # One LSTM layer over inputs of shape (timesteps, nfeatures), then a scalar output
    model = Sequential()
    model.add(LSTM(32, input_shape=(X_train.shape[1], X_train.shape[2])))
    model.add(Dropout(0.2))
    model.add(Dense(1))
    model.compile(loss='logcosh', optimizer='adam')
    return model

# nb_epoch is the Keras 1 spelling; Keras 2 names this argument epochs
my_model = KerasRegressor(build_fn=baseline_model, nb_epoch=30, batch_size=32, verbose=False)
history = my_model.fit(X_train, y_train)
So far, everything is OK. The problem comes when I execute the following line, which raises an error:
Note: X_train has a shape equal to (nsamples, timesteps, nfeatures) and y_train has a shape (nsamples)
perm = PermutationImportance(my_model, random_state=1).fit(X_train, y_train)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-75-c9cc23da0083> in <module>()
2 d2_train_dataset = X_train.reshape((nsamples, timesteps * features))
3
----> 4 perm = PermutationImportance(my_model, random_state=1).fit(X_train, y_train)
5 #eli5.show_weights(perm, feature_names = X.columns.tolist())
~/anaconda3/lib/python3.6/site-packages/eli5/sklearn/permutation_importance.py in fit(self, X, y, groups, **fit_params)
183 self.estimator_.fit(X, y, **fit_params)
184
--> 185 X = check_array(X)
186
187 if self.cv not in (None, "prefit"):
~/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
568 if not allow_nd and array.ndim >= 3:
569 raise ValueError("Found array with dim %d. %s expected <= 2."
--> 570 % (array.ndim, estimator_name))
571 if force_all_finite:
572 _assert_all_finite(array,
ValueError: Found array with dim 3. Estimator expected <= 2.
What can I do to fix this error? How can I use eli5 with my LSTM Keras Model?
Best regards
I get the same error. Any luck with this?
@ogreyesp Hey! I think the problem here is that scikit-learn expects a 2D NumPy array as the training data for a fit function, and the dataset you are passing is a 3D array. Please try reshaping it to a 2D array.
I'm running into the same issue. When I reshape the training dataset to 2D, I then get an error because my model expects a 3D input.
`~\Anaconda3\envs\keras-gpu\lib\site-packages\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    125 ': expected ' + names[i] + ' to have ' +
    126 str(len(shape)) + ' dimensions, but got array '
--> 127 'with shape ' + str(data_shape))
    128 if not check_batch_axis:
    129 data_shape = data_shape[1:]
ValueError: Error when checking input: expected lstm_5_input to have 3 dimensions, but got array with shape (12, 324)`
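One way around this conflict might be a thin adapter that takes the 2D array scikit-learn-style tools ask for and reshapes it back to 3D before calling the model. The sketch below is only an illustration: `Flat2DAdapter` and `ToyModel` are hypothetical names, and `ToyModel` stands in for the fitted Keras model so the snippet runs without Keras.

```python
import numpy as np


class Flat2DAdapter:
    """Expose a 3D-input model through a 2D (nsamples, timesteps * nfeatures) API."""

    def __init__(self, model, timesteps, nfeatures):
        self.model = model
        self.timesteps = timesteps
        self.nfeatures = nfeatures

    def _to_3d(self, X2d):
        # Undo X3d.reshape(nsamples, timesteps * nfeatures)
        X2d = np.asarray(X2d)
        return X2d.reshape(X2d.shape[0], self.timesteps, self.nfeatures)

    def fit(self, X2d, y, **kwargs):
        self.model.fit(self._to_3d(X2d), y, **kwargs)
        return self

    def predict(self, X2d):
        return np.asarray(self.model.predict(self._to_3d(X2d))).reshape(-1)

    def score(self, X2d, y):
        # Negative mean squared error, so that higher is better
        return -float(np.mean((self.predict(X2d) - np.asarray(y)) ** 2))


class ToyModel:
    """Hypothetical stand-in for a fitted model that insists on 3D input."""

    def fit(self, X, y, **kwargs):
        return self

    def predict(self, X):
        assert X.ndim == 3, "expected (nsamples, timesteps, nfeatures)"
        return X[:, -1, 0]  # last timestep of feature 0


X3d = np.arange(24, dtype=float).reshape(2, 4, 3)
X2d = X3d.reshape(2, 12)
adapter = Flat2DAdapter(ToyModel(), timesteps=4, nfeatures=3)
preds = adapter.predict(X2d)
```

In principle an adapter like this could be handed to `PermutationImportance` together with `X_train.reshape(nsamples, -1)`, in which case each of the timesteps * nfeatures columns would get its own importance. Whether eli5 accepts it without more scikit-learn plumbing (for example `get_params`) is untested here.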
If I am not mistaken, LSTM layers require a 3D array, so I don't think eli5 can explain an LSTM.
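If the 2D check is the only obstacle, permutation importance can also be computed by hand on the 3D array. The following is a minimal sketch of that idea, not eli5's implementation: `permutation_importance_3d` and `toy_predict` are hypothetical names, mean squared error is an assumed metric, and the toy predictor stands in for a trained model.

```python
import numpy as np


def permutation_importance_3d(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in MSE when one feature is shuffled across samples.

    predict: callable taking an array of shape (nsamples, timesteps, nfeatures)
             and returning predictions of shape (nsamples,) or (nsamples, 1).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)

    def mse(X_eval):
        pred = np.asarray(predict(X_eval)).reshape(len(y))
        return float(np.mean((pred - y) ** 2))

    baseline = mse(X)
    importances = np.zeros(X.shape[2])
    for j in range(X.shape[2]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Swap whole time series of feature j between samples, so the
            # ordering of timesteps inside each series is left intact.
            order = rng.permutation(X.shape[0])
            X_perm[:, :, j] = X[order, :, j]
            increases.append(mse(X_perm) - baseline)
        importances[j] = np.mean(increases)
    return importances


def toy_predict(X):
    # Hypothetical model that only looks at feature 0
    return X[:, :, 0].sum(axis=1)


data_rng = np.random.default_rng(42)
X_demo = data_rng.normal(size=(200, 10, 3))
y_demo = toy_predict(X_demo)
scores = permutation_importance_3d(toy_predict, X_demo, y_demo)
```

With the regressor from the original post, `predict` would presumably be `my_model.predict`, with `X_train` and `y_train` as the data. Only feature 0 should get a non-zero score in the toy example, since the toy predictor ignores the others.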
I discovered another library, SHAP, which allows you to analyze LSTM and other 3D-array models, and I finally managed to get the feature importance for my LSTM model. For more details, check out my answer here.
@jarrettyeo, I just have a question: should X_test have the same shape in both DE.shap_values and summary_plot, or do the shapes have to differ?
import numpy as np
import shap
import tensorflow as tf

tf.compat.v1.disable_eager_execution()
background = X_train[np.random.choice(X_train.shape[0], 100, replace=False)]
DE = shap.DeepExplainer(model, background)  # X_train is a 3d numpy.ndarray
shap_values = DE.shap_values(X_test, check_additivity=False)  # X_test is a 3d numpy.ndarray
shap.initjs()
shap.summary_plot(
    shap_values[0],
    X_test,
    feature_names=list_column,
    max_display=12,
    plot_type='bar')
@rebeen I can't remember what shape they need to be in, but here is some of my code from my project which you can adapt:
import numpy as np
import shap

def convert_3d_to_2d(array):
    # Flatten (nsamples, timesteps, nfeatures) into (nsamples, timesteps * nfeatures)
    if not isinstance(array, np.ndarray):
        raise TypeError("type(array)={} != numpy.ndarray".format(type(array)))
    return array.reshape(array.shape[0], array.shape[1] * array.shape[2])

DE = shap.DeepExplainer(model, X_train)  # X_train is 3d array
shap_values = DE.shap_values(X_validate, check_additivity=False)  # X_validate is 3d array
shap.summary_plot(
    convert_3d_to_2d(shap_values[0]),  # <- This is probably what you need
    X_validate,
    feature_names=list_columns
)
You didn't mention what problem you were facing, but I am guessing it's whether shap_values[0] is correct for summary_plot(). If that is the case, you just need to convert shap_values[0] from 3d to 2d using a custom function, convert_3d_to_2d(shap_values[0]), because the plot does not know how to plot it in 3d. Let me know if that works.
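For anyone unsure how the shapes line up after flattening, here is a small NumPy-only sketch. Random arrays stand in for shap_values[0] and X_validate, and the feature names are made up. This thread does not confirm whether summary_plot also needs the feature array flattened, so the sketch simply keeps values, features, and names consistent with one another.

```python
import numpy as np

nsamples, timesteps, nfeatures = 5, 4, 3
rng = np.random.default_rng(0)
shap_3d = rng.normal(size=(nsamples, timesteps, nfeatures))  # stand-in for shap_values[0]
X_3d = rng.normal(size=(nsamples, timesteps, nfeatures))     # stand-in for X_validate
list_columns = ["f0", "f1", "f2"]                            # hypothetical feature names

# Option 1: one column per (timestep, feature) pair.
# A C-order reshape puts element (t, f) at column t * nfeatures + f,
# so the names loop over timesteps on the outside and features on the inside.
shap_flat = shap_3d.reshape(nsamples, timesteps * nfeatures)
X_flat = X_3d.reshape(nsamples, timesteps * nfeatures)
flat_names = [f"{name}_t{t}" for t in range(timesteps) for name in list_columns]

# Option 2: one column per feature, summing absolute SHAP values over time.
# This drops the sign and the timestep, keeping only an overall magnitude.
shap_per_feature = np.abs(shap_3d).sum(axis=1)        # shape (nsamples, nfeatures)
global_importance = shap_per_feature.mean(axis=0)     # shape (nfeatures,)
```

Option 1 gives timesteps * nfeatures bars, which can get crowded. Option 2 is a lossy summary that matches the original list of column names one to one.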
Thank you. Actually, I ran into this problem:
`TypeError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/shap/plots/summary.py in summary_plot(shap_values, features, feature_names, max_display, plot_type, color, axis_color, title, alpha, show, sort, color_bar, plot_size, layered_violin_max_num_bins, class_names, class_inds, color_bar_label, auto_size_plot)
    148 summary_plot(
    149 proj_shap_values, features[:, sort_inds] if features is not None else None,
--> 150 feature_names=feature_names[sort_inds],
    151 sort=False, show=False, color_bar=False,
    152 plot_size=None,
TypeError: only integer scalar arrays can be converted to a scalar index`
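This message is what Python raises when a plain list is indexed with a NumPy integer array, which is what `feature_names[sort_inds]` on line 150 of the traceback would do if `feature_names` is a list. The snippet below reproduces the error in isolation and shows that converting the names to an array avoids it. The names and indices are made up, and whether this is the actual cause in the case above is an assumption.

```python
import numpy as np

feature_names = ["a", "b", "c"]   # hypothetical names, held in a plain list
sort_inds = np.array([2, 0, 1])   # an index array like the one in the traceback

try:
    feature_names[sort_inds]      # a list cannot be indexed by an array
    message = None
except TypeError as exc:
    message = str(exc)

# A NumPy array of names supports this kind of fancy indexing
reordered = np.array(feature_names)[sort_inds]
```

If this is the cause, passing `feature_names=np.array(list_column)` to `summary_plot` may be enough.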
@rebeen Can you open a question on Stack Overflow and link it here? Then we can avoid hijacking this eli5 thread.
@jarrettyeo Thank you very much. I saw your answer on Stack Overflow and have solved the problem. I now want to run the code properly and will let you know about the results. Rebeen
@rebeen Please do, happy to help
@jarrettyeo Thank you very much. I could not post a screenshot of the results here, so I sent it to you on LinkedIn. Could you please let me know your opinion?
Getting the same, any updates on this?