Support custom mapping without type parameters

Open klausmh opened this issue 6 years ago • 5 comments

ML.NET supports custom mappings if a source and destination type are specified statically.

Would it also be possible to support this in a dynamic setting, i.e., doing custom mappings from IDataView to IDataView?

klausmh avatar May 22 '19 00:05 klausmh

As someone trying to play w/ ONNX & TF in ML.NET, I also feel this pain.

I'd like a simple way to just say: read from column {A, B, C} and write to {X, Y, Z} while appending to the existing IDataView.

justinormont avatar May 22 '19 00:05 justinormont

When executed, a custom mapping does produce another IDataView from the given IDataView. If your question is whether the types TSrc and TDst can be dynamic, my answer would be yes.

        public static CustomMappingEstimator<TSrc, TDst> CustomMapping<TSrc, TDst>(this TransformsCatalog catalog, Action<TSrc, TDst> mapAction, string contractName,
                SchemaDefinition inputSchemaDefinition = null, SchemaDefinition outputSchemaDefinition = null)
            where TSrc : class, new()
            where TDst : class, new()
            => new CustomMappingEstimator<TSrc, TDst>(catalog.GetEnvironment(), mapAction, contractName, inputSchemaDefinition, outputSchemaDefinition);

You can create inputSchemaDefinition and outputSchemaDefinition dynamically to define the schema of your TSrc and TDst.
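As a minimal sketch of what this could look like (not taken from the thread): the row classes `InputRow` and `OutputRow` and the column names `Feature` and `Doubled` are hypothetical, chosen only for illustration. The sketch assumes `SchemaDefinition.Create` and the settable `ColumnName` property behave as described in the ML.NET API reference. Note that TSrc and TDst remain compile-time classes here; only the column names bound to their members are decided at run time.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row classes, for illustration only.
public class InputRow
{
    public float A { get; set; }
}

public class OutputRow
{
    public float X { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext();

        // Start from the schema inferred from the class, then rebind the
        // member "A" to an IDataView column whose name is chosen at run time.
        var inputSchema = SchemaDefinition.Create(typeof(InputRow));
        inputSchema["A"].ColumnName = "Feature";

        // Same idea for the output: member "X" is written to column "Doubled".
        var outputSchema = SchemaDefinition.Create(typeof(OutputRow));
        outputSchema["X"].ColumnName = "Doubled";

        var rows = new[] { new InputRow { A = 1f }, new InputRow { A = 2f } };
        IDataView data = mlContext.Data.LoadFromEnumerable(rows, inputSchema);

        // contractName is null, so this pipeline cannot be saved to disk.
        var pipeline = mlContext.Transforms.CustomMapping<InputRow, OutputRow>(
            (src, dst) => dst.X = src.A * 2f,
            contractName: null,
            inputSchemaDefinition: inputSchema,
            outputSchemaDefinition: outputSchema);

        IDataView transformed = pipeline.Fit(data).Transform(data);

        // List the columns of the resulting IDataView.
        foreach (var column in transformed.Schema)
            Console.WriteLine(column.Name);
    }
}
```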

wschin avatar May 24 '19 23:05 wschin

@klausmh the main limitation, however, is that the mapping cannot be stateful. It can only map rows to new rows with no external state for the mapping.

For example, it cannot do a pass over the data to train some parameters for the mapping, and then apply the mapping.
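To illustrate the row-to-row behavior, here is a rough sketch under assumed names (`Reading`, `Scaled`, and the factor `0.5f` are hypothetical, not from the thread). Each output row depends only on the matching input row, and `Fit` does not learn anything from the data; the scaling factor is fixed when the pipeline is built.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row classes, for illustration only.
public class Reading
{
    public float Value { get; set; }
}

public class Scaled
{
    public float ScaledValue { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext();
        var rows = new[] { new Reading { Value = 1f }, new Reading { Value = 3f } };
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Row-wise mapping: the action sees one source row and fills one
        // destination row. It has no access to other rows or to statistics
        // computed over the whole data set.
        var pipeline = mlContext.Transforms.CustomMapping<Reading, Scaled>(
            (src, dst) => dst.ScaledValue = src.Value * 0.5f,
            contractName: null);

        // Fit does not train any parameters for a custom mapping.
        IDataView transformed = pipeline.Fit(data).Transform(data);

        float[] values = transformed.GetColumn<float>("ScaledValue").ToArray();
        Console.WriteLine(string.Join(", ", values));
    }
}
```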

artidoro avatar Jun 03 '19 20:06 artidoro

When executed, a custom mapping does produce another IDataView from the given IDataView. If your question is whether the types TSrc and TDst can be dynamic, my answer would be yes.

        public static CustomMappingEstimator<TSrc, TDst> CustomMapping<TSrc, TDst>(this TransformsCatalog catalog, Action<TSrc, TDst> mapAction, string contractName,
                SchemaDefinition inputSchemaDefinition = null, SchemaDefinition outputSchemaDefinition = null)
            where TSrc : class, new()
            where TDst : class, new()
            => new CustomMappingEstimator<TSrc, TDst>(catalog.GetEnvironment(), mapAction, contractName, inputSchemaDefinition, outputSchemaDefinition);

You can create inputSchemaDefinition and outputSchemaDefinition dynamically to define the schema of your TSrc and TDst.

I am having trouble passing dynamic classes. Is there an example?

husseinshaib1 avatar Oct 17 '21 11:10 husseinshaib1

I would also appreciate an example of the mapping with dynamic types, considering that an IDataView can be created dynamically using Context.Data.LoadFromTextFile or with DataFrame:

var view = DataFrame.LoadCsvFromString(
  myCsvStr,
  header: false,
  separator: ',',
  columnNames: columns.ToArray(),
  dataTypes: columns.Select(o => o.GetType()).ToArray());

It would make sense not to use strict types in the CustomMapping as well.

artemiusgreat avatar Jun 24 '22 00:06 artemiusgreat