Column CustomMapping overload

Open luisquintanilla opened this issue 3 years ago • 0 comments

Problem

When I want to apply a custom transform to a single data column in my dataset, I have to provide the input and output types. If I've used TextLoader to load my data without defining schema classes, I now have to go and create new classes for my input and output.

var cols = new [] 
{
    new TextLoader.Column(name: "Text", dataKind: DataKind.String, index: 47),
    new TextLoader.Column(
        name: "Categorical",
        dataKind: DataKind.String,
        source: new TextLoader.Range[] {
            new TextLoader.Range(19),
            new TextLoader.Range(20),
            new TextLoader.Range(21),
            new TextLoader.Range(50),
            new TextLoader.Range(59),
            new TextLoader.Range(71),
            new TextLoader.Range(83)}),
    new TextLoader.Column(name: "Label", dataKind: DataKind.String, index: 2)
};

var dataLoader = ctx.Data.CreateTextLoader(columns: cols);
var idv = dataLoader.Load(path);

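// Schema class that exists only so CustomMapping has input and output types.
public class CustomData { public string Text { get; set; } }

// Read from the input row and write the upper-cased value to the output row.
Action<CustomData, CustomData> customTransform = (rowIn, rowOut) => {
    rowOut.Text = rowIn.Text.ToUpper();
};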
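
For context, here is a minimal sketch of what the complete workaround looks like today, assuming the existing generic `CustomMapping<TSrc, TDst>` extension in `Microsoft.ML`. The class names `TextInput` and `TextOutput`, the `Workaround` helper, and the `CapitalizedText` column are hypothetical and chosen only for illustration; `ctx` and `idv` are the context and data view from the snippet above.

```csharp
using System;
using Microsoft.ML;

// Hypothetical schema classes; they exist only to give CustomMapping its types.
public class TextInput { public string Text { get; set; } }
public class TextOutput { public string CapitalizedText { get; set; } }

public static class Workaround
{
    public static IDataView Capitalize(MLContext ctx, IDataView idv)
    {
        // Maps the "Text" column to a new "CapitalizedText" column, row by row.
        Action<TextInput, TextOutput> capitalize = (rowIn, rowOut) =>
            rowOut.CapitalizedText = rowIn.Text.ToUpper();

        // contractName is only required if the model will be saved and reloaded.
        var pipeline = ctx.Transforms.CustomMapping(capitalize, contractName: null);
        return pipeline.Fit(idv).Transform(idv);
    }
}
```

The two extra classes are the overhead this issue is about: they carry no information beyond the column names and types that the `TextLoader.Column` definitions already specify.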

Proposed Solution

Create an overload of CustomMapping that takes InputColumnName and OutputColumnName parameters, looks up the specified columns, and applies the transform to them, so no input or output classes are needed.

Action<string, string> capitalize = (colIn, colOut) =>
{
    colOut = colIn.ToUpper();
};

// Proposed overload (does not exist yet): the columns are identified by name.
var capitalizeTransform = ctx.Transforms.CustomMapping(capitalize, "capitalizeTransform", inputColumnName: "Text", outputColumnName: "CapitalizedText");

luisquintanilla, Sep 20 '22 02:09