Support for target based encoders

Open siddharth98765 opened this issue 4 years ago • 3 comments

I was trying to implement a target-based encoder, but I am having difficulty writing a converter for it:

class CategoricalTransformerOnnx(BaseEstimator, util.TransformerWithTargetMixin):
    def __init__(self, cols=None, a=1):
        self.cols = cols
        self.a = a
        self.use_default_cols = cols is None

    def fit(self, X, y, **kwargs):
        if self.use_default_cols:
            self.cols = util.get_obj_cols(X)
        else:
            self.cols = util.convert_cols_to_list(self.cols)

        print(self.cols)
        X_temp = self.transform(X, y)
        return X_temp

    # Transform method we wrote for this transformer
    def transform(self, X, y):
        for column in self.cols:
            global_mean = y.mean()
            temp = y.groupby(X[column].astype(str)).agg(['cumsum', 'cumcount'])
            X[column] = (temp['cumsum'] - y + global_mean) / (temp['cumcount'] + self.a)
        return X

def cat_shape_calculator(operator):
    input = operator.inputs[0]
    N = input.type.shape[0]  # number of observations
    C = 22  # dimension of outputs

    # new output definition
    # operator.inputs[0].type = FloatTensorType([N, C])
    operator.outputs[0].type = FloatTensorType([N, C])

def to_onnx_converter(scope, operator, container):
    output = operator.outputs[0]  # output in ONNX graph
    op = operator.raw_operator

    print(op)

    name = scope.get_unique_operator_name('CategoricalTransformerOnnx')
    attrs = {'name': scope.get_unique_operator_name('CategoricalTransformerOnnx')}
    # attrs = {}
    container.add_node('CategoricalEncoder', operator.input_full_names,
                       operator.output_full_names, op_domain='ai.onnx.ml',
                       **attrs)

update_registered_converter(CategoricalTransformerOnnx, 'CustomTransformer',
                            cat_shape_calculator, to_onnx_converter)

This gives the following error:

InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Error in Node:CategoricalTransformerOnnx1 : No Op registered for CategoricalEncoder with domain_version of 1

I'm new to the ONNX platform, so there might be some flaw in my code.

siddharth98765, Jun 13 '20 12:06
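For readers following the formula in `transform`, here is a small plain-Python restatement of what the pandas expression appears to compute for a single column. It relies on `cumsum` being the running sum of `y` including the current row and `cumcount` being the number of earlier rows in the same group, so each row is encoded from the targets of earlier rows in its category only. The function name `ordered_target_encode` and the toy data are made up for illustration and are not part of the issue's code.

```python
from collections import defaultdict


def ordered_target_encode(categories, targets, a=1.0):
    """Encode each row using only the targets of earlier rows in its category.

    Mirrors (cumsum - y + global_mean) / (cumcount + a) for one column.
    """
    global_mean = sum(targets) / len(targets)
    running_sum = defaultdict(float)  # sum of y over earlier rows, per category
    running_count = defaultdict(int)  # number of earlier rows, per category
    encoded = []
    for cat, y in zip(categories, targets):
        key = str(cat)
        encoded.append((running_sum[key] + global_mean) / (running_count[key] + a))
        # Update the running statistics only after encoding the current row.
        running_sum[key] += y
        running_count[key] += 1
    return encoded


cats = ["a", "b", "a", "a", "b"]
ys = [1, 0, 1, 0, 1]
encoded = ordered_target_encode(cats, ys, a=1)
print([round(v, 4) for v in encoded])  # [0.6, 0.6, 0.8, 0.8667, 0.3]
```

Note that every value depends on `y` and on row order, which is why this training-time computation cannot be replayed as-is at prediction time.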

ONNX only supports a specific set of operators, which you can find at https://github.com/onnx/onnx/blob/master/docs/Operators.md and https://github.com/onnx/onnx/blob/master/docs/Operators-ml.md. The final graph has to be a combination of these operators. You can look at this example: https://github.com/onnx/sklearn-onnx/blob/master/docs/examples/plot_custom_model.py.

xadupre, Jun 15 '20 12:06

Do we have to implement the entire custom transformer using ONNX operators in the converter? From the example, it seems so. Also, how do we give multiple inputs to the operator, such as X and y, where some transformation has to be done on columns of X based on the values of y? Here X is a dataframe.

siddharth98765, Jun 16 '20 06:06

If you need a completely new converter, you need to define three elements: a parser, which defines the number of inputs and outputs of your model; a shape calculator, which defines the size of every output; and a converter, which converts the model into ONNX. You can find a complete example here: https://github.com/onnx/sklearn-onnx/blob/master/docs/examples/plot_custom_parser.py.

xadupre, Jun 16 '20 12:06
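On the question of passing both X and y: at prediction time there is no y, so one possible reading of the replies in this thread is that the target should only be used in `fit`, and `transform` should depend on X alone. The sketch below is a plain-Python illustration of that split, not skl2onnx code. The class name `LookupTargetEncoder`, its attributes, and the fallback for unseen categories are all assumptions made for this example.

```python
from collections import defaultdict


class LookupTargetEncoder:
    """Hypothetical encoder: learns a category -> value table in fit,
    then applies it in transform without needing y."""

    def __init__(self, a=1.0):
        self.a = a

    def fit(self, categories, targets):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for cat, y in zip(categories, targets):
            sums[str(cat)] += y
            counts[str(cat)] += 1
        self.global_mean_ = sum(targets) / len(targets)
        # Same smoothing as the training-time formula, using all rows of a category.
        self.mapping_ = {
            key: (sums[key] + self.global_mean_) / (counts[key] + self.a)
            for key in sums
        }
        # Assumption: an unseen category behaves like one with no rows at all.
        self.default_ = self.global_mean_ / self.a
        return self

    def transform(self, categories):
        # Pure lookup: no y and no dependence on row order.
        return [self.mapping_.get(str(c), self.default_) for c in categories]


enc = LookupTargetEncoder(a=1).fit(["a", "b", "a", "a", "b"], [1, 0, 1, 0, 1])
result = enc.transform(["a", "b", "z"])
print([round(v, 4) for v in result])  # [0.65, 0.5333, 0.6]
```

Once `transform` is a fixed table like `mapping_`, the converter only has to express a key-to-value lookup with existing operators. The `ai.onnx.ml` domain has a `LabelEncoder` operator that maps keys to values, which may be a reasonable starting point, though I have not tested it against the transformer in this issue.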