Onnx export of ColumnSelector doesn't drop input columns

Open antoniovs1029 opened this issue 4 years ago • 0 comments

This is essentially the same issue I opened on ML.NET (https://github.com/dotnet/machinelearning/issues/4970); I'm filing it here in NimbusML as well, for the record.

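Notice that OnnxRunner correctly drops "case2" but keeps all the other input columns, even though ColumnSelector should leave only "age". Also note that the print labels in the code are swapped: the OnnxRunner result is printed under "ORT RESULT", and the DataFrameTool result under "ONNX RUNNER RESULT".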

Code

import numpy
from nimbusml import Pipeline, FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.preprocessing import OnnxRunner
from nimbusml.preprocessing.schema import ColumnSelector, ColumnDuplicator
from data_frame_tool import DataFrameTool as DFT

path = get_dataset('infert').as_filepath()
dataset = FileDataStream.read_csv(path, sep=',',
                                  numeric_dtype=numpy.float32,
                                  names={0: 'row_num', 5: 'case'})

dataset = dataset.to_df()

pipeline = Pipeline([
    ColumnDuplicator(columns={'case2': 'case'}),
    ColumnSelector(columns=['age']),
])


print("\n\nML.NET RESULT")
result_expected = pipeline.fit_transform(dataset)
print(result_expected)

print("\n\nORT RESULT")
onnx_path = "C:\\Users\\anvelazq\\Desktop\\is25colsel\\colsel.onnx"
pipeline.export_to_onnx(onnx_path, 'com.microsoft.ml')
onnxrunner = OnnxRunner(model_file=onnx_path)
result_onnx = onnxrunner.fit_transform(dataset)
print(result_onnx)

print("\n\nONNX RUNNER RESULT")
df_tool = DFT(onnx_path)
result_ort = df_tool.execute(dataset, [])
print(result_ort)

Output

ML.NET RESULT
      age
0    26.0
1    42.0
2    39.0
3    34.0
4    35.0
..    ...
243  31.0
244  34.0
245  35.0
246  29.0
247  23.0

[248 rows x 1 columns]


ORT RESULT
     row_num education   age  parity  induced  case  spontaneous  stratum  pooled.stratum
0        1.0    0-5yrs  26.0     6.0      1.0   1.0          2.0      1.0             3.0
1        2.0    0-5yrs  42.0     1.0      1.0   1.0          0.0      2.0             1.0
2        3.0    0-5yrs  39.0     6.0      2.0   1.0          0.0      3.0             4.0
3        4.0    0-5yrs  34.0     4.0      2.0   1.0          0.0      4.0             2.0
4        5.0   6-11yrs  35.0     3.0      1.0   1.0          1.0      5.0            32.0
..       ...       ...   ...     ...      ...   ...          ...      ...             ...
243    244.0   12+ yrs  31.0     1.0      0.0   0.0          1.0     79.0            45.0
244    245.0   12+ yrs  34.0     1.0      0.0   0.0          0.0     80.0            47.0
245    246.0   12+ yrs  35.0     2.0      2.0   0.0          0.0     81.0            54.0
246    247.0   12+ yrs  29.0     1.0      0.0   0.0          1.0     82.0            43.0
247    248.0   12+ yrs  23.0     1.0      0.0   0.0          1.0     83.0            40.0

[248 rows x 9 columns]


ONNX RUNNER RESULT
     age.output
0          26.0
1          42.0
2          39.0
3          34.0
4          35.0
..          ...
243        31.0
244        34.0
245        35.0
246        29.0
247        23.0

[248 rows x 1 columns]
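
For anyone who wants to check this programmatically rather than by reading the printed frames, here is a minimal sketch that lists the columns a result carries beyond the expected ones. The helper name `leaked_columns` is hypothetical (it is not part of NimbusML), and the column names are copied from the output above.

```python
def leaked_columns(expected_columns, actual_columns):
    """Return the names in actual_columns that are missing from expected_columns.

    Hypothetical helper, not a NimbusML API. The order of actual_columns is kept.
    """
    expected = set(expected_columns)
    return [name for name in actual_columns if name not in expected]


# Column names copied from the output above.
expected_cols = ['age']                                   # pipeline.fit_transform
onnxrunner_cols = ['row_num', 'education', 'age', 'parity', 'induced',
                   'case', 'spontaneous', 'stratum', 'pooled.stratum']

leaked = leaked_columns(expected_cols, onnxrunner_cols)
print(leaked)
```

In the script above, the same check would be `leaked_columns(result_expected.columns, result_onnx.columns)`; an empty list would mean the exported model drops the input columns as expected.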

antoniovs1029, Apr 10 '20 23:04