machinelearning
Column CustomMapping overload
Problem
When I want to apply a custom transform to a single column in my dataset, I have to provide input and output types. If I loaded my data with TextLoader and never defined schema classes, I now have to create new classes just to describe the transform's input and output.
var cols = new[]
{
    new TextLoader.Column(name: "Text", dataKind: DataKind.String, index: 47),
    new TextLoader.Column(
        name: "Categorical",
        dataKind: DataKind.String,
        source: new TextLoader.Range[] {
            new TextLoader.Range(19),
            new TextLoader.Range(20),
            new TextLoader.Range(21),
            new TextLoader.Range(50),
            new TextLoader.Range(59),
            new TextLoader.Range(71),
            new TextLoader.Range(83) }),
    new TextLoader.Column(name: "Label", dataKind: DataKind.String, index: 2)
};
var dataLoader = ctx.Data.CreateTextLoader(columns: cols);
var idv = dataLoader.Load(path);
public class CustomData { public string Text { get; set; } }
Action<CustomData, CustomData> customTransform = (rowIn, rowOut) =>
{
    // Write the upper-cased input text to the output row.
    rowOut.Text = rowIn.Text.ToUpper();
};
Proposed Solution
Create an overload of CustomMapping that takes inputColumnName and outputColumnName parameters, looks up the named columns, and applies the transform to them.
// Proposed usage; this overload does not exist yet.
Action<string, string> capitalize = (colIn, colOut) =>
{
    colOut = colIn.ToUpper();
};
var capitalizeTransform = ctx.Transforms.CustomMapping(
    capitalize,
    "capitalizeTransform",
    inputColumnName: "Text",
    outputColumnName: "CapitalizedText");
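Until something like this exists in the library, a similar effect can probably be had with a small user-side helper. The following is only a sketch, not the proposed built-in overload: `MapStringColumn`, `StringInput`, and `StringOutput` are hypothetical names made up for illustration. It wraps the existing generic `CustomMapping<TSrc, TDst>` and uses `SchemaDefinition` to rebind one carrier property to the requested column names. It takes a `Func<string, string>` rather than an `Action<string, string>`, because assigning to a by-value `string` parameter (as in `colOut = colIn.ToUpper()`) would never reach the output column.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical carrier types: they exist only so the caller does not
// have to declare schema classes for a single string column.
public class StringInput { public string Value { get; set; } }
public class StringOutput { public string Value { get; set; } }

public static class ColumnMappingSketch
{
    // Hypothetical helper (not part of ML.NET): applies `map` to one
    // string column and writes the result to another column.
    public static IEstimator<ITransformer> MapStringColumn(
        this TransformsCatalog catalog,
        Func<string, string> map,
        string contractName,
        string inputColumnName,
        string outputColumnName)
    {
        // Bind StringInput.Value to the requested input column.
        var inputSchema = SchemaDefinition.Create(typeof(StringInput));
        inputSchema[nameof(StringInput.Value)].ColumnName = inputColumnName;

        // Bind StringOutput.Value to the requested output column.
        var outputSchema = SchemaDefinition.Create(typeof(StringOutput));
        outputSchema[nameof(StringOutput.Value)].ColumnName = outputColumnName;

        // Delegate to the existing generic CustomMapping API.
        return catalog.CustomMapping<StringInput, StringOutput>(
            (src, dst) => dst.Value = map(src.Value),
            contractName,
            inputSchema,
            outputSchema);
    }
}
```

Under these assumptions, usage would look like `ctx.Transforms.MapStringColumn(s => s.ToUpper(), "capitalizeTransform", inputColumnName: "Text", outputColumnName: "CapitalizedText")`. The sketch covers only `string` columns; a generic version would need carrier types parameterized on the column type.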