Missing converter for HashingTF

Open sansr opened this issue 4 years ago • 2 comments

Hello everyone!

I am trying to convert an instance of the Spark ML HashingTF transformer. When I invoke the convert_sparkml function, I get an error saying that 'pyspark.ml.feature.HashingTF' is not supported.

Looking at the source code, I found that 'pyspark.ml.feature.HashingTF' is listed as a transformer in get_sparkml_operator_name (in ops_names.py), but it is missing from the map built in get_input_names (in ops_input_output.py), which is where a transformer/estimator is checked for validity. Is there a reason why it is not supported, or is this a bug?
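To make the mismatch concrete, here is a minimal, self-contained sketch of the situation as I understand it. It does not import onnxmltools: the two dictionaries and the helpers check_supported and registered_but_unconvertible are hypothetical stand-ins for the real tables in ops_names.py and ops_input_output.py, and the Binarizer entry is only there for contrast.

```python
# Simplified model of the two lookups. These dictionaries are illustrative
# assumptions, not the actual contents of the onnxmltools tables.

# Stand-in for the operator-name map in ops_names.py: HashingTF IS listed.
OPERATOR_NAMES = {
    "pyspark.ml.feature.Binarizer": "pyspark.ml.feature.Binarizer",
    "pyspark.ml.feature.HashingTF": "pyspark.ml.feature.HashingTF",
}

# Stand-in for the input/output map in ops_input_output.py: HashingTF is NOT listed.
IO_NAMES = {
    "pyspark.ml.feature.Binarizer": (["inputCol"], ["outputCol"]),
}


def check_supported(class_name):
    """Return the (inputs, outputs) entry, or raise if the class cannot be converted."""
    if class_name not in OPERATOR_NAMES:
        raise ValueError(f"'{class_name}' is unknown")
    if class_name not in IO_NAMES:
        # This is the branch a HashingTF instance would hit in this model.
        raise ValueError(f"'{class_name}' is not supported")
    return IO_NAMES[class_name]


def registered_but_unconvertible():
    """List classes that have an operator name but no input/output entry."""
    return sorted(set(OPERATOR_NAMES) - set(IO_NAMES))


if __name__ == "__main__":
    print(registered_but_unconvertible())
    try:
        check_supported("pyspark.ml.feature.HashingTF")
    except ValueError as err:
        print(err)
```

In this model, a class can be present in the first table and still be rejected, which matches the behavior I am seeing.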

Any help is much appreciated. Thank you!

sansr, Feb 19 '21 12:02

Hi @sansr, did you find a solution? I have run into the same issue. I can see that HashingTF is present as a key in build_sparkml_operator_name_map() in ops_names.py, yet it still throws a KeyError.

bipin2295, Jul 02 '21 03:07

There is no easy way to compute hashes with ONNX. There is no dedicated operator, and hashing with the current operators is not straightforward. This implementation may take some time.
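For context, the sketch below shows in plain Python the kind of computation a HashingTF-style transformer performs, which is why a string hash is needed in the first place. It is only an illustration under stated assumptions: the function name hashing_tf is hypothetical, zlib.crc32 is used as a stand-in hash (Spark uses MurmurHash3, so the bucket indices will differ from Spark's), and num_features is kept tiny for readability (Spark's default is 2^18).

```python
import zlib


def hashing_tf(terms, num_features=16):
    """Map a list of string terms to a fixed-length term-frequency vector.

    Assumption: zlib.crc32 stands in for the hash function; it does NOT
    reproduce the bucket indices that Spark's HashingTF would produce.
    """
    vector = [0.0] * num_features
    for term in terms:
        # Hash the raw string, then fold the hash into a bucket index.
        index = zlib.crc32(term.encode("utf-8")) % num_features
        vector[index] += 1.0
    return vector


if __name__ == "__main__":
    print(hashing_tf(["spark", "onnx", "spark"]))
```

The hashing step operates directly on arbitrary strings, and that is the part that has no direct ONNX counterpart; the modulo and the counting are the easier pieces.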

xadupre, Aug 24 '21 08:08