explainerdashboard
String categorical values from LightGBM
Hello, great tool and library! I wonder if you can point me in the right direction to solve an issue:
- With LightGBM we can list the categorical columns, and it is fine if their values are strings.
- Once in ExplainerDashboard, the code crashes because it cannot handle string values (at least that is my understanding).
What could be a solution in this case? Thanks.
Hi @Guidosalimbeni,
If the model is able to handle categorical values, then ExplainerDashboard should handle them as well. It does at least for CatBoost, so I assume it should work for LightGBM as well.
Do you have some runnable example code that shows the crash or wrong output?
Hi @Guidosalimbeni,
Would you be able to provide any examples of where this broke?
Hello, I am running into this issue. Here is the error message that I get:
TypeError: '<' not supported between instances of 'float' and 'str'
And here is a reproducible example:
from lightgbm import LGBMClassifier
from sklearn.model_selection import train_test_split
from explainerdashboard import ClassifierExplainer, ExplainerDashboard
import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")
df[df.select_dtypes("O").columns] = df.select_dtypes("O").astype("category")
df = df[["Survived", "Age", "Sex", "Embarked"]]
y = df.pop("Survived")
X = df
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = LGBMClassifier()
model.fit(X_train, y_train, feature_name='auto', categorical_feature='auto')

explainer = ClassifierExplainer(
    model, X_test, y_test,
    labels=['Not survived', 'Survived'])
db = ExplainerDashboard(explainer, title="Titanic Explainer",
                        whatif=False,
                        shap_interaction=False,
                        decision_trees=False)
db.run(port=8051)
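For reference, the message in the traceback is the one Python raises when a float is compared with a string using "<". The sketch below only reproduces that message in isolation. It assumes the float is a missing value (NaN) sitting next to string category labels such as "S". Whether that is what actually happens inside explainerdashboard or LightGBM is not confirmed in this thread. The helper name try_compare is hypothetical.

```python
def try_compare(a, b):
    """Return a < b, or the TypeError message if the comparison is unsupported."""
    try:
        return a < b
    except TypeError as exc:
        return str(exc)

# Assumption for illustration: a missing value is stored as a float NaN,
# while the category labels are plain strings.
missing = float("nan")
message = try_compare(missing, "S")
print(message)  # '<' not supported between instances of 'float' and 'str'
```

If this is the cause, any step that orders or compares a column mixing NaN and strings would fail in the same way.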
I am having the same problem, is there a solution?
Yes great, I am still having the same issue.
Hi,
I have the same issue due to string values in the data. I'd like to create a dashboard for a fitted model, a TabularPredictor from the AutoGluon library. Is there any solution or update related to this issue?
I think this issue is more related to the data and how LightGBM is coded.
I stumbled upon this error, but it was an error from LightGBM, not explainerdashboard.
Try the following:
df.columns = df.columns.str.translate("".maketrans({"[":"{", "]":"}","<":"^"}))
df.columns[df.columns.str.contains(r"[\[\]<]")]
The first line replaces the characters [, ] and < in the column names, and the second lists any column names that still contain them (it should come back empty). This targets the error:
TypeError: '<' not supported between instances of 'float' and 'str'
Do let me know if that solves your issue.
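The column-renaming suggestion above can be tried without pandas on a plain list of names. This is a minimal sketch that mirrors the same character mapping. The helper names sanitize_names and offending_names and the sample column "Fare[<10]" are hypothetical, made up for illustration.

```python
import re

# Same mapping as in the suggestion above: "[" -> "{", "]" -> "}", "<" -> "^".
REPLACEMENTS = str.maketrans({"[": "{", "]": "}", "<": "^"})
# Raw string, so the backslashes reach the regex engine unchanged.
PATTERN = re.compile(r"[\[\]<]")

def sanitize_names(names):
    """Return the names with the targeted characters replaced."""
    return [str(name).translate(REPLACEMENTS) for name in names]

def offending_names(names):
    """Return the names that still contain "[", "]" or "<"."""
    return [name for name in names if PATTERN.search(str(name))]

columns = ["Age", "Fare[<10]", "Sex"]  # made-up column names
cleaned = sanitize_names(columns)
print(offending_names(columns))  # names flagged before renaming
print(cleaned)                   # names after renaming
print(offending_names(cleaned))  # should be empty after renaming
```

With a DataFrame, the equivalent would be df.columns = sanitize_names(df.columns). Note that this only changes column names, not the values inside the columns.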