NimbusML
Pipeline.get_fit_info shows incorrect columns
The inputs and outputs reported by Pipeline.get_fit_info are incorrect: the column name 'c1' is split into the individual characters 'c' and '1'. See the inputs, outputs and schema_after entries in the RangeFilter section of the output for this example:
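import pprint
import numpy as np
import pandas as pd
from nimbusml import Pipeline
from nimbusml.preprocessing.filter import RangeFilter

# Two float32 columns; only 'c1' is passed to RangeFilter
train_data = {'c1': [2, 3, 4, 5],
              'c2': [20, 30.8, 39.2, 51]}
train_df = pd.DataFrame(train_data).astype({'c1': np.float32,
                                            'c2': np.float32})
pipeline = Pipeline([
    RangeFilter(min=0, max=10, columns=['c1']),
])
pipeline.fit(train_df)
# Inspect the per-step inputs, outputs and schema reported after fitting
info = pipeline.get_fit_info(train_df)
pprint.pprint(info)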
which outputs:
([{'name': None,
'operator': None,
'outputs': ['c1', 'c2'],
'schema_after': ['c1', 'c2'],
'type': 'start'},
{'inputs': ['c', '1'],
'name': 'RangeFilter',
'operator': RangeFilter(columns=['c1'], complement=False, include_max=None,
include_min=True, max=10, min=0),
'outputs': ['c', '1'],
'schema_after': ['c1', 'c2', 'c', '1'],
'type': 'transform'}],
[<nimbusml.internal.utils.entrypoints.EntryPoint object at 0x00000286956BBEB8>])
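To make the mismatch easy to spot, here is a minimal sketch that checks the reported column names against the columns of the training data. The helper find_unknown_columns is hypothetical and not part of NimbusML. The step list is copied by hand from the output above, with the operator objects left out, so the snippet runs without NimbusML installed.

```python
# Hypothetical helper, not part of NimbusML: flags column names reported by
# get_fit_info that do not exist in the training data.
def find_unknown_columns(steps, known_columns):
    known = set(known_columns)
    problems = []
    for step in steps:
        for key in ("inputs", "outputs", "schema_after"):
            # A step may lack a key (the 'start' step has no 'inputs')
            for name in step.get(key) or []:
                if name not in known:
                    problems.append((step.get("name"), key, name))
    return problems


# Step list copied from the output above; operator objects are omitted
reported_steps = [
    {"name": None, "type": "start",
     "outputs": ["c1", "c2"], "schema_after": ["c1", "c2"]},
    {"name": "RangeFilter", "type": "transform",
     "inputs": ["c", "1"], "outputs": ["c", "1"],
     "schema_after": ["c1", "c2", "c", "1"]},
]

problems = find_unknown_columns(reported_steps, ["c1", "c2"])
for step_name, key, column in problems:
    print(f"{step_name}: {key} contains unknown column {column!r}")
```

Every flagged entry belongs to the RangeFilter step. The split names match what Python produces when a string is iterated character by character (list('c1') gives ['c', '1']). That may hint at the cause, but it is a guess and has not been checked against the NimbusML code.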