TE2Rules

compatibility with xgboost 2.0.3

Open qingyuanxingsi opened this issue 1 year ago • 2 comments

Using xgboost 2.0.3 (with categorical support), I get the following error:

        model_explainer = ModelExplainer(
          File "/mllab/miniconda3/envs/llm-3.9/lib/python3.9/site-packages/te2rules/explainer.py", line 110, in __init__
            self.random_forest = XgboostXGBClassifierAdapter(
          File "/mllab/miniconda3/envs/llm-3.9/lib/python3.9/site-packages/te2rules/adapter.py", line 254, in __init__
            self.random_forest = self._convert()
          File "/mllab/miniconda3/envs/llm-3.9/lib/python3.9/site-packages/te2rules/adapter.py", line 290, in _convert
            node = self._build_tree(tree_dict)
          File "/mllab/miniconda3/envs/llm-3.9/lib/python3.9/site-packages/te2rules/adapter.py", line 266, in _build_tree
            i = int(tree_dict["split"][1:])
        ValueError: invalid literal for int() with base 10

qingyuanxingsi avatar Jul 25 '24 12:07 qingyuanxingsi

Can you give some more details about your use case:

  • Is your XGBoost model trained for Binary Classification?
  • Does the notebook in the README work for you with your version of XGBoost?

groshanlal avatar Aug 04 '24 19:08 groshanlal

I'm getting the same issue with xgboost (any version). The issue is here in te2rules:

        if "leaf" in tree_dict:
            node = DecisionTree(LeafNode(value=float(tree_dict["leaf"])))
        else:
            # Get feature index. Ex: feature-21 would be
            # represented as f21. Get index 21 from f21.
            i = int(tree_dict["split"][1:])

It looks like it's expecting feature names of the form fNN (f0, f1, ..., f21). However, the split feature name in my model is not in that format. Setting a breakpoint there, here is what the feature name looks like in my case:

tree_dict["split"]
Out[2]: 'close_15m'
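To make the mismatch concrete, here is a minimal sketch in plain Python. It first reproduces the failing parse on the split name shown above, then shows one way a lookup could tolerate both name styles. `split_to_index` is a hypothetical helper, not part of te2rules, and `open_15m` is a made-up second column used only for illustration.

```python
# Reproduce the parse from adapter.py on a pandas-style column name.
split = "close_15m"
try:
    int(split[1:])  # same expression as in _build_tree
    error = None
except ValueError as exc:
    error = str(exc)


def split_to_index(split, feature_names):
    """Hypothetical helper: map a split name to a column index.

    Accepts XGBoost's default names (f0, f1, ...) and falls back to
    looking the name up in the list of training column names.
    """
    suffix = split[1:]
    if split.startswith("f") and suffix.isdecimal():
        return int(suffix)
    # Raises ValueError if the name is not a known column.
    return feature_names.index(split)


columns = ["open_15m", "close_15m"]  # illustrative column names
index_default = split_to_index("f21", columns)
index_named = split_to_index("close_15m", columns)
```

One caveat with this sketch: a DataFrame column that is itself named like `f3` would be read as index 3, so a real fix would need to know whether the model was trained with named columns.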

Update: I think xgboost needs to be trained on NumPy arrays rather than pandas DataFrames. Testing now and will report back.

Answer: Converting the data set to NumPy arrays before training worked.

jmrichardson avatar Feb 04 '25 16:02 jmrichardson