onnxmltools
How to solve Feature name error while converting an XGBClassifier model to ONNX?
I trained an XGBClassifier model, and now I want to convert it to the ONNX format. It should be straightforward using this code:
import onnxmltools
from skl2onnx.common.data_types import FloatTensorType
initial_types = [('float_input', FloatTensorType([None, X_train.shape[1]]))]
xgb_onnx = onnxmltools.convert_xgboost(xgb.xgb_category_cls, initial_types=initial_types)
onnxmltools.utils.save_model(xgb_onnx, 'xgb_onnx.onnx')
However, I get this error, which is related to one of my feature names:
77 feature_id = int(float(feature_id))
78 except ValueError:
---> 79 raise RuntimeError(
80 "Unable to interpret '{0}', feature "
81 "names should follow pattern 'f%d'.".format(
RuntimeError: Unable to interpret 'state', feature names should follow pattern 'f%d'.
I am not sure what I did wrong.
I came across the same issue. The converter expects either no feature names or names that are "0", "1", ... or "f0", "f1", "f2", ....
You can work around this issue by renaming the features like this:
booster = model.get_booster()
original_feature_names = booster.feature_names
if original_feature_names is not None:
    onnx_converter_conform_feature_names = [f"f{num}" for num in range(len(original_feature_names))]
    booster.feature_names = onnx_converter_conform_feature_names
But be careful: this overwrites the original booster of the model, meaning that from now on the feature names of the xgboost model are changed, and calling model.predict with validate_features=True on the original dataset may fail.
If you throw away the model after the ONNX conversion, you are fine.
Otherwise, I would suggest making a deep copy of the model first, e.g. via save + load.
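For illustration, the copy-first idea can be sketched without xgboost installed. Here `types.SimpleNamespace` is a hypothetical stand-in for the object returned by `model.get_booster()`; in real code you would deep-copy the fitted model (or save and reload it) and rename only the copy used for export:

```python
import copy
from types import SimpleNamespace

# Stand-in for model.get_booster(); in real code this is the trained
# XGBoost booster carrying the original feature names.
booster = SimpleNamespace(feature_names=["state", "abc_1", "abc_2"])

# Deep-copy first, then rename only the copy used for the ONNX export.
export_booster = copy.deepcopy(booster)
export_booster.feature_names = [f"f{i}" for i in range(len(export_booster.feature_names))]

print(booster.feature_names)         # original names stay intact
print(export_booster.feature_names)  # converter-conform f0, f1, ...
```

This way the original model still validates features against the original dataset, while the renamed copy satisfies the converter's 'f%d' pattern.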
Do you have an example I could use to replicate the issue?
Here is a minimal example with which I can reproduce this error (xgboost version 1.7.5):
# %%
from onnxmltools import convert_xgboost
from skl2onnx.common.data_types import FloatTensorType
from xgboost.sklearn import XGBClassifier
import pandas
import numpy as np
num_columns = 5
num_rows = 20
seed = 42
np.random.seed(seed)
X = np.random.random_sample((num_rows, num_columns))
y = np.random.randint(low=0, high=2, size=num_rows)
y = pandas.Series(y)
X = pandas.DataFrame(X)
columns = [f"abc_{num}" for num in range(num_columns)]
X.columns = columns
model = XGBClassifier(random_state=seed)
model = model.fit(X, y)
initial_type = [('float_input', FloatTensorType([None, num_columns]))]
convert_xgboost(model=model, initial_types=initial_type, target_opset=14)
I had the same issue. I resolved it by renaming the features to f0-fn (where n is the number of features - 1). The problem occurs when a number is skipped (f0, f1, f2, f4), or the feature names don't start from f0.
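A small helper (my own naming, not part of onnxmltools) can generate the sequential converter-conform names and keep a mapping back to the originals, so the real names can be restored after the export:

```python
def onnx_safe_feature_names(original_names):
    """Build sequential names f0..f(n-1), as the converter expects,
    plus a mapping from the safe names back to the originals."""
    safe = [f"f{i}" for i in range(len(original_names))]
    return safe, dict(zip(safe, original_names))

safe, back_to_original = onnx_safe_feature_names(["state", "zip_code", "age"])
print(safe)  # ['f0', 'f1', 'f2']
```

Assign `safe` to `booster.feature_names` before calling `convert_xgboost`, then use the mapping to restore the original names if you still need the model afterwards.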