DALEX
Add support for Multiregression tasks
I tried the Python version of dalex with a multiregression model and it raised an error (see below). Is there any way around it? If I understand correctly, iBreakDown/pyBreakDown can deal with multiple classes for classification, whose probabilities are also organized in multiple columns/arrays, so this should be quite similar. It would be great if this were enabled. The SHAP package also supports SHAP values for the multiregression case.
Can I call iBreakDown directly from dalex, without generating an explainer object? The iBreakDown for Python has not been updated in a while, but the new Python dalex seems quite active.
# decision tree for multioutput regression
import dalex as dx
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor
# create datasets
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, n_targets=2, random_state=1, noise=0.5)
# define model
model = DecisionTreeRegressor()
model.fit(X, y)
dx.Explainer(model, X, y)
data is converted to pd.DataFrame, columns are set as string numbers
-> data : 1000 rows 10 cols
Traceback (most recent call last):
File "
File "C:\Users\Thomas Wolf\anaconda3\envs\my-rdkit-env\lib\site-packages\dalex_explainer\object.py", line 131, in __init__
    y = check_y(y, data, verbose)
File "C:\Users\Thomas Wolf\anaconda3\envs\my-rdkit-env\lib\site-packages\dalex_explainer\checks.py", line 52, in check_y
    raise ValueError("y must have only one dimension")
ValueError: y must have only one dimension
We don't support multi-output models yet. You can adjust the predict_function so that it returns a single output, and produce iBreakDown plots for a given output.
import dalex as dx
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, n_targets=2, random_state=1, noise=0.5)
model = DecisionTreeRegressor()
model.fit(X,y)
exp_0 = dx.Explainer(model, X, y[:, 0], predict_function = lambda m, d: m.predict(d)[:, 0], label="output 0")
exp_1 = dx.Explainer(model, X, y[:, 1], predict_function = lambda m, d: m.predict(d)[:, 1], label="output 1")
exp_0.predict_parts(X[2, :]).plot(exp_1.predict_parts(X[2, :]))
y[2, :]
Example https://dalex.drwhy.ai/python-dalex-multioutput added in https://github.com/ModelOriented/DALEX-docs/commit/47378806fa9d32b612fb84adb46640638956ede7.
Hi @hbaniecki, this would be nice.
But it would be important to consider the MultiOutput wrapper of scikit-learn.
Currently I am creating explainers for every target and then adding them to the plots:
(every line represents a model)
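A minimal sketch of that per-target approach, generalizing the two-explainer workaround above to any number of outputs. The helper names single_output_predict and build_explainers are hypothetical and not part of dalex. The sketch assumes the model's predict returns an array of shape (n_samples, n_targets), which should hold for natively multi-output estimators as well as for scikit-learn's MultiOutputRegressor.

```python
import numpy as np


def single_output_predict(index):
    """Return a dalex-style predict_function that keeps one output column.

    Hypothetical helper: dalex calls predict_function(model, data) and
    expects a 1-D array back, so we slice column `index` from the
    (n_samples, n_targets) prediction matrix.
    """
    def predict(model, data):
        return np.asarray(model.predict(data))[:, index]
    return predict


def build_explainers(model, X, y, labels=None):
    """Create one dx.Explainer per target column of y (hypothetical helper)."""
    import dalex as dx  # imported lazily so the helper above works without dalex

    y = np.asarray(y)
    n_outputs = y.shape[1]
    if labels is None:
        labels = [f"output {i}" for i in range(n_outputs)]
    return [
        dx.Explainer(
            model, X, y[:, i],
            predict_function=single_output_predict(i),
            model_type="regression",  # set explicitly rather than relying on inference
            label=labels[i],
            verbose=False,
        )
        for i in range(n_outputs)
    ]


# Example usage (requires scikit-learn and dalex; not executed here):
#   from sklearn.datasets import make_regression
#   from sklearn.multioutput import MultiOutputRegressor
#   from sklearn.tree import DecisionTreeRegressor
#   X, y = make_regression(n_samples=1000, n_features=10, n_targets=2, random_state=1)
#   model = MultiOutputRegressor(DecisionTreeRegressor()).fit(X, y)
#   explainers = build_explainers(model, X, y)
#   parts = [exp.predict_parts(X[2, :]) for exp in explainers]
#   parts[0].plot(parts[1:])  # overlay the remaining targets on the first plot
```

Since every explainer wraps the same fitted model and differs only in the selected column, the labels are what tell the targets apart in a combined plot.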