sklearn-onnx
Support for target based encoders
I am trying to implement a target-based encoder and am having difficulty writing a converter for it.
class CategoricalTransformerOnnx(BaseEstimator, util.TransformerWithTargetMixin):
    def __init__(self, cols=None, a=1):
        self.cols = cols
        self.a = a
        self.use_default_cols = cols is None

    def fit(self, X, y, **kwargs):
        if self.use_default_cols:
            self.cols = util.get_obj_cols(X)
        else:
            self.cols = util.convert_cols_to_list(self.cols)
        print(self.cols)
        X_temp = self.transform(X, y)
        return X_temp

    # Transformer method we wrote for this transformer
    def transform(self, X, y):
        for column in self.cols:
            global_mean = y.mean()
            temp = y.groupby(X[column].astype(str)).agg(['cumsum', 'cumcount'])
            X[column] = (temp['cumsum'] - y + global_mean) / (temp['cumcount'] + self.a)
        return X

def cat_shape_calculator(operator):
    input = operator.inputs[0]
    N = input.type.shape[0]  # number of observations
    C = 22  # dimension of outputs
    # new output definition
    # operator.inputs[0].type = FloatTensorType([N, C])
    operator.outputs[0].type = FloatTensorType([N, C])

def to_onnx_converter(scope, operator, container):
    output = operator.outputs[0]  # output in ONNX graph
    op = operator.raw_operator
    print(op)
    name = scope.get_unique_operator_name('CategoricalTransformerOnnx')
    attrs = {'name': scope.get_unique_operator_name('CategoricalTransformerOnnx')}
    # attrs = {}
    container.add_node('CategoricalEncoder', operator.input_full_names,
                       operator.output_full_names, op_domain='ai.onnx.ml',
                       **attrs)

update_registered_converter(CategoricalTransformerOnnx, 'CustomTransformer', cat_shape_calculator, to_onnx_converter)
This produces the following error:
InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Error in Node:CategoricalTransformerOnnx1 : No Op registered for CategoricalEncoder with domain_version of 1
I'm new to the ONNX platform, so there might be some flaw in my code.
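For readers following the arithmetic in the transform method above, here is a minimal plain-Python sketch of the same per-row computation, without pandas. The function name ordered_target_encode is hypothetical, and the sketch assumes that 'cumsum' includes the current row (so subtracting y leaves the sum of earlier rows) and that 'cumcount' counts earlier rows in the same group starting at 0.

```python
from collections import defaultdict


def ordered_target_encode(categories, targets, a=1):
    """Encode each row using only the targets of earlier rows in its category.

    Hypothetical helper mirroring, per row:
        (cumsum - y + global_mean) / (cumcount + a)
    """
    global_mean = sum(targets) / len(targets)
    running_sum = defaultdict(float)  # sum of earlier targets, per category
    running_count = defaultdict(int)  # number of earlier rows, per category
    encoded = []
    for cat, y in zip(categories, targets):
        key = str(cat)  # mirrors X[column].astype(str)
        # running_sum[key]   plays the role of temp['cumsum'] - y
        # running_count[key] plays the role of temp['cumcount']
        encoded.append((running_sum[key] + global_mean) / (running_count[key] + a))
        running_sum[key] += y
        running_count[key] += 1
    return encoded


cats = ["a", "b", "a", "a", "b"]
ys = [1, 0, 1, 0, 1]
print(ordered_target_encode(cats, ys))
```

With these inputs the global mean is 0.6, so the first row of each category gets 0.6 / 1, and the second "a" row gets (1 + 0.6) / (1 + 1) = 0.8. Every value depends on y and on row order, which is the part that is hard to express at conversion time.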
ONNX only supports a specific set of operators, which you can find at https://github.com/onnx/onnx/blob/master/docs/Operators.md and https://github.com/onnx/onnx/blob/master/docs/Operators-ml.md. CategoricalEncoder is not one of them, which is why the runtime reports that no op is registered for it. The converted model has to be a graph built from the supported operators. You can look at this example: https://github.com/onnx/sklearn-onnx/blob/master/docs/examples/plot_custom_model.py.
Do we have to implement the entire custom transformer using ONNX operators in the converter? From the example it seems so. Also, how do we pass multiple inputs to the operator, such as X and y, when the columns of X have to be transformed based on the values of y? Here X is a dataframe.
If you need a completely new converter, you need to define three elements: a parser, which defines the number of inputs and outputs of your model; a shape calculator, which defines the size of every output; and a converter, which converts the model into ONNX. You can find a complete example here: https://github.com/onnx/sklearn-onnx/blob/master/docs/examples/plot_custom_parser.py.
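On the X, y part of the question, one point is worth spelling out: a scikit-learn pipeline calls transform with X only at prediction time, and an exported model likewise receives only features, so anything derived from y has to be learned in fit and stored on the estimator. Below is a minimal plain-Python sketch of that split. It assumes a smoothed per-category mean rather than the running statistic used in the transform method from the question, and the class name SmoothedMeanLookup and its attributes are hypothetical, not part of sklearn-onnx.

```python
from collections import defaultdict


class SmoothedMeanLookup:
    """Hypothetical encoder: y is used in fit only, transform needs X alone."""

    def __init__(self, a=1):
        self.a = a

    def fit(self, categories, targets):
        self.global_mean_ = sum(targets) / len(targets)
        sums = defaultdict(float)
        counts = defaultdict(int)
        for cat, y in zip(categories, targets):
            sums[str(cat)] += y
            counts[str(cat)] += 1
        # Fixed category -> value table, computed once from y.
        self.table_ = {
            key: (sums[key] + self.global_mean_) / (counts[key] + self.a)
            for key in sums
        }
        return self

    def transform(self, categories):
        # Unseen categories fall back to the global mean (an assumption).
        return [self.table_.get(str(c), self.global_mean_) for c in categories]


encoder = SmoothedMeanLookup(a=1).fit(["a", "b", "a", "a", "b"], [1, 0, 1, 0, 1])
print(encoder.transform(["a", "b", "c"]))
```

A fixed string-to-float table like table_ is the kind of state a converter can embed in the graph. The LabelEncoder operator in ai.onnx.ml, for instance, takes keys_strings and values_floats attributes in recent opsets, though whether it fits this particular encoder has not been tested here.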