ONNX model of ToKey outputs 1-based keys, while 0-based keys are expected

Open · ganik opened this issue 4 years ago • 1 comment

Repro:

```python
from nimbusml.datasets import get_dataset
from nimbusml.preprocessing import OnnxRunner, ToKey

iris_df = get_dataset("iris").as_df()
iris_df = iris_df.drop(['Label'], axis=1)

transform = ToKey() << {'NewVals': 'Setosa'}
print(transform.fit_transform(iris_df))
transform.export_to_onnx("test.onnx", 'com.microsoft.ml')
onnx_runner = OnnxRunner(model_file="test.onnx")
print(onnx_runner.fit_transform(iris_df))
```

Output:

         Sepal_Length  Sepal_Width  Petal_Length  Petal_Width    Species  Setosa  NewVals
    0             5.1          3.5           1.4          0.2     setosa     1.0        0
    1             4.9          3.0           1.4          0.2     setosa     1.0        0
    2             4.7          3.2           1.3          0.2     setosa     1.0        0
    3             4.6          3.1           1.5          0.2     setosa     1.0        0
    4             5.0          3.6           1.4          0.2     setosa     1.0        0
    ..            ...          ...           ...          ...        ...     ...      ...
    145           6.7          3.0           5.2          2.3  virginica     0.0        1
    146           6.3          2.5           5.0          1.9  virginica     0.0        1
    147           6.5          3.0           5.2          2.0  virginica     0.0        1
    148           6.2          3.4           5.4          2.3  virginica     0.0        1
    149           5.9          3.0           5.1          1.8  virginica     0.0        1

    [150 rows x 7 columns]
         Sepal_Length  Sepal_Width  Petal_Length  ...  Species.onnx.0  Setosa.onnx.0  NewVals.onnx.0
    0             5.1          3.5           1.4  ...          setosa            1.0             1.0
    1             4.9          3.0           1.4  ...          setosa            1.0             1.0
    2             4.7          3.2           1.3  ...          setosa            1.0             1.0
    3             4.6          3.1           1.5  ...          setosa            1.0             1.0
    4             5.0          3.6           1.4  ...          setosa            1.0             1.0
    ..            ...          ...           ...  ...             ...            ...             ...
    145           6.7          3.0           5.2  ...       virginica            0.0             2.0
    146           6.3          2.5           5.0  ...       virginica            0.0             2.0
    147           6.5          3.0           5.2  ...       virginica            0.0             2.0
    148           6.2          3.4           5.4  ...       virginica            0.0             2.0
    149           5.9          3.0           5.1  ...       virginica            0.0             2.0

    [150 rows x 13 columns]

ganik · Feb 11 '20 15:02

Update: It was decided that this issue would not be resolved, as Pandas expects 1-based keys.

Lynx1820 · Oct 19 '20 21:10
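Since the ONNX output is staying 1-based, anyone who needs it to line up with the 0-based `NewVals` column from `fit_transform` could shift the keys on their side. The following is only a sketch under assumptions: `to_zero_based` is a hypothetical helper (not part of NimbusML), the sample values are copied from the output in this issue rather than produced by running the model, and missing keys are not handled.

```python
from typing import Iterable, List


def to_zero_based(keys: Iterable[float]) -> List[int]:
    """Shift 1-based keys (floats, as in NewVals.onnx.0) to 0-based ints.

    Hypothetical helper for illustration; not a NimbusML API.
    """
    shifted = []
    for key in keys:
        # Reject anything that is not a whole number >= 1.
        if key < 1 or key != int(key):
            raise ValueError(f"not a 1-based key: {key!r}")
        shifted.append(int(key) - 1)
    return shifted


# Sample values copied from the two NewVals columns shown in this issue.
onnx_keys = [1.0, 1.0, 2.0, 2.0]   # NewVals.onnx.0 (OnnxRunner output)
nimbus_keys = [0, 0, 1, 1]         # NewVals (ToKey.fit_transform output)

converted = to_zero_based(onnx_keys)
print(converted)  # [0, 0, 1, 1]
assert converted == nimbus_keys
```

On a DataFrame, the same shift can be written as `(df["NewVals.onnx.0"] - 1).astype("int64")`, assuming the column contains no missing values.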