onnxmltools

Onnx convert_sparkml should support handleInvalid parameter

Open mohaidoss opened this issue 1 year ago • 0 comments

Hello,

I'm trying to convert a pyspark.ml pipeline [StringIndexer, OneHotEncoder] to ONNX, but I get the following error:

RuntimeError: Operator pyspark_ml_feature_StringIndexerModel (type: pyspark.ml.feature.StringIndexerModel) got an input col_name with a wrong type <class 'onnxconverter_common.data_types.FloatTensorType'>. Only [<class 'onnxconverter_common.data_types.Int64TensorType'>, <class 'onnxconverter_common.data_types.StringTensorType'>] are allowed

The col_name column is of type Boolean, which is indeed not supported by the StringIndexer converter. However, the conversion shouldn't raise an error if the handleInvalid parameter is set to keep. See here
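For context, here is a minimal pure-Python sketch of what the handleInvalid options mean for StringIndexer as described in the Spark documentation: "error" raises, "skip" drops the row, and "keep" puts invalid values in an extra bucket at index numLabels. It uses neither pyspark nor onnxmltools, and the helper names fit_labels and index_values are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, List


def fit_labels(values: Iterable[str]) -> Dict[str, int]:
    """Assign an index to each label, most frequent first.

    Ties are broken alphabetically here; this is an assumption of the
    sketch, not a statement about every Spark version.
    """
    counts: Dict[str, int] = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: i for i, label in enumerate(ordered)}


def index_values(
    values: Iterable[str],
    labels: Dict[str, int],
    handle_invalid: str = "error",
) -> List[int]:
    """Index values, treating labels unseen at fit time per handle_invalid."""
    out: List[int] = []
    for v in values:
        if v in labels:
            out.append(labels[v])
        elif handle_invalid == "keep":
            # Extra bucket placed after all known labels (index numLabels).
            out.append(len(labels))
        elif handle_invalid == "skip":
            # Drop the row entirely.
            continue
        else:
            raise ValueError(f"Unseen label: {v!r}")
    return out


labels = fit_labels(["a", "b", "a", "c", "a", "b"])
kept = index_values(["a", "z", "c"], labels, handle_invalid="keep")
skipped = index_values(["a", "z", "c"], labels, handle_invalid="skip")
```

With "keep", the unseen value "z" lands in the extra bucket instead of raising, which is the behaviour I would expect the converted ONNX model to preserve.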

I haven't checked other models, but I suppose the same issue can arise wherever handleInvalid applies.

Best regards, Mehdi

mohaidoss, Mar 03 '23 14:03