mljar-supervised
No Shap outputs
Hi, I'm not seeing any SHAP outputs when using the following:
from supervised.automl import AutoML

# Initialize AutoML in Explain mode
automl = AutoML(mode="Explain",
                explain_level=2,
                ml_task='multiclass_classification')
automl.fit(X, y)
This is in spite of shap being properly installed. What I get out of the above code is the following:
AutoML directory: AutoML_7
The task is multiclass_classification with evaluation metric logloss
AutoML will use algorithms: ['Baseline', 'Linear', 'Decision Tree', 'Random Forest', 'Neural Network']
AutoML will ensemble available models
AutoML steps: ['simple_algorithms', 'default_algorithms', 'ensemble']
* Step simple_algorithms will try to check up to 3 models
1_Baseline logloss 3.229533 trained in 25.56 seconds
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
2_DecisionTree logloss 2.15877 trained in 59.34 seconds
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
3_Linear logloss 1.707406 trained in 47.68 seconds
* Step default_algorithms will try to check up to 2 models
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
4_Default_NeuralNetwork logloss 4.045366 trained in 7.02 seconds
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
5_Default_RandomForest logloss 1.858415 trained in 75.39 seconds
* Step ensemble will try to check up to 1 model
Ensemble logloss 1.288517 trained in 0.56 seconds
AutoML fit time: 226.47 seconds
AutoML best model: Ensemble
AutoML(explain_level=2, ml_task='multiclass_classification')
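Since the log itself does not say whether any SHAP artefacts were written, one way to confirm the report is to scan the results directory directly. The sketch below is only a diagnostic aid: the helper `find_shap_files` is hypothetical, and it assumes that SHAP outputs, if produced, have "shap" somewhere in their file names and sit in per-model sub-folders of the AutoML directory (`AutoML_7` in the log above). Neither assumption is confirmed in this thread.

```python
from pathlib import Path


def find_shap_files(results_dir):
    """Group files whose name contains 'shap' by the sub-folder they sit in.

    Assumption: SHAP artefacts carry 'shap' in their file name.
    Returns a dict mapping folder name -> sorted list of file names.
    """
    root = Path(results_dir)
    found = {}
    if not root.is_dir():
        # Nothing to scan, e.g. the directory name is wrong.
        return found
    for path in sorted(root.rglob("*")):
        if path.is_file() and "shap" in path.name.lower():
            found.setdefault(path.parent.name, []).append(path.name)
    return found


# "AutoML_7" is the directory named in the log above.
shap_files = find_shap_files("AutoML_7")
if not shap_files:
    print("No files with 'shap' in their name were found.")
for model, files in shap_files.items():
    print(model, files)
```

An empty result for every model folder would match the behaviour described in this issue, while a result for only some models would point at a model-specific or data-specific cause.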
Thanks @dbrami for reporting. Is it possible to include data to reproduce the issue?
Sure. Uploading my ipynb and data: Archive.zip
Hi Pavel, any luck?
It happened to me too, @dbrami, but with tree visualizations and the same explain_level value. Let me know if you find something.
It happened to me too, also with tree visualizations. I think it may be related to the task: tree visualizations might not be suitable for binary classification.
Hi @williamty, please make sure that you have the latest version of the package (pip install -U mljar-supervised); decision trees should be produced. Regarding the missing SHAP plots, it might be a bug.
I have the same issue with the latest version: no SHAP values are produced. Is there a previous stable version w.r.t. this feature?
Maybe there were some changes in the shap API?
I think the issue is that the current implementation does not accept object/category/string types, and everything needs to be numeric for SHAP to work. That kind of defeats the objective of AutoML if one wants to use SHAP to guide feature selection.
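If that hypothesis is right, one way to test it is to encode the non-numeric columns yourself before calling `fit` and check whether SHAP outputs then appear. The following is a minimal sketch using pandas only; the helper `to_numeric_frame` and the toy data are hypothetical, and plain integer codes are just one possible encoding (they impose an arbitrary order on the categories).

```python
import pandas as pd


def to_numeric_frame(X: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of X with every non-numeric column replaced by integer codes."""
    X_num = X.copy()
    for col in X_num.columns:
        if not pd.api.types.is_numeric_dtype(X_num[col]):
            # Categories are numbered 0..n-1; missing values become -1.
            X_num[col] = X_num[col].astype("category").cat.codes
    return X_num


# Toy data, made up for illustration.
X = pd.DataFrame({
    "age": [31, 45, 27],
    "city": ["Paris", "Lyon", "Paris"],
    "plan": pd.Categorical(["basic", "pro", "basic"]),
})
X_num = to_numeric_frame(X)
print(X_num.dtypes)
```

The encoded frame could then be passed in place of the original, as in `automl.fit(to_numeric_frame(X), y)`. If SHAP plots show up with the encoded data but not with the raw data, that would support the explanation above; if they are still missing, the cause lies elsewhere.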