NimbusML issues

Error loading a model that was saved with mlnet auto-train

9

**Describe the bug** When a model is created with the mlnet auto-train tool and then loaded using NimbusML, an exception is thrown. **To Reproduce** Steps to reproduce...

RokoToken

CV creates incorrect split of user defined transforms.

1

When specifying `split_start='after_transforms'` in `CV.fit()`, the user defined transforms are not split up correctly. See the graph created by the `fit()` call in the code below. It seems like if...

pieths

Numerical categorical columns are not supported

NimbusML only supports string-based categorical columns. Numerical categorical columns (`KeyDataViewType`) returned from ML.NET are not converted back to their original representation even though Pandas does...

pieths

User defined transforms drop features if not explicitly specified

Using a transform, which only acts on a subset of the input columns, before a predictor and not explicitly specifying the features to the predictor will only pass the output...

pieths

ONNX model for NgramFeaturizer doesn't output actual tokens

Repro ` from nimbusml.datasets import get_dataset from nimbusml import FileDataStream from nimbusml.preprocessing import OnnxRunner from nimbusml.feature_extraction.text import NGramFeaturizer from nimbusml.feature_extraction.text.extractor import Ngram path = get_dataset("wiki_detox_train").as_filepath() data = FileDataStream.read_csv(path, sep='\t') transformer...

ganik

Add support for batch transfers

Currently, each return value is transferred from managed code to native code one at a time. See `NativeDataInterop.cs`. ```csharp for (int i = 0; i < fillers.Length; i++) { fillers[i].Set();...

pieths

DatasetTransformer throws when used on its own

Repro:

    r0 = Pipeline([MinMaxScaler()])
    r0.fit(train_df)
    r1 = Pipeline([DatasetTransformer(r0.model)])
    r1.fit_transform(train_df)

ganik

Setting output_topic_word_summary=True still does not produce the topic-word summary in the output

1

**Describe the bug** In the LightLda module, I want to get the topic-word summary output, so I set output_topic_word_summary=True and num_summary_term_per_topic=20, but the topic-word summary does not appear in the output. The only...

Wuliyuanulb

P1

Supporting pathlib's Path objects in FileDataStream

5

Fixes #269. `pathlib`'s Path objects can be converted to strings just by casting, and vice versa. I added a check in `FileDataStream`'s init function to convert a Path object...

pnshinde
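
The Path-to-string cast described in the pull request above can be illustrated with a minimal sketch. The helper name `normalize_path` is hypothetical and only mirrors the idea of the check; it is not taken from NimbusML's source, and the actual change to `FileDataStream` may look different.

```python
from pathlib import Path


def normalize_path(path):
    """Return `path` as a plain string, accepting either str or pathlib.Path.

    Hypothetical helper for illustration only; not NimbusML's actual code.
    """
    if isinstance(path, Path):
        # Casting a Path to str yields its string representation.
        return str(path)
    return path


# A Path is converted to a string...
as_string = normalize_path(Path("data") / "train.csv")
# ...and a string can be cast back to an equivalent Path.
round_trip = Path(as_string)
# Plain strings pass through unchanged.
unchanged = normalize_path("train.csv")
```

Doing the conversion once at the entry point would let the rest of the code keep working with plain strings, which is presumably why the check sits in the init function.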

Pipeline.get_fit_info shows incorrect columns

The inputs and outputs which are produced by `Pipeline.get_fit_info` are not valid. See `inputs`, `outputs` and `schema_after` in the `RangeFilter` section of the output: ```python train_data = {'c1': [2, 3,...

pieths
