'Label' not found
System Information (please complete the following information):
- OS & Version: Windows 11
- ML.NET Version: ML.NET v2.0.1
- .NET Version: .NET 7
Describe the bug
AutoML reports that the Label column is missing from the schema when the data is loaded from a text (CSV) file without a header row.
Row example:
5;0.968795895576477;0.8838793039321899;1.0125187635421753;1.0380022525787354;1.003713607788086;0.8773788213729858;0.7508044838905334;0.7412265539169312;0.7468504905700684;0.7589845061302185;0.7755808234214783;0.7674760818481445;0.6741359829902649;0.6582905054092407;0.6679562926292419;0.6805443167686462;0.6613132357597351;0.48050951957702637;0.5232967138290405;0.5599182844161987;0.522437334060669;0.5111487507820129;0.5027106404304504
To Reproduce
var ctx = new MLContext(1);
var opts = new TextLoader.Options
{
    HasHeader = false,
    Columns = new[]
    {
        new TextLoader.Column("Label", DataKind.UInt32, 0),
        new TextLoader.Column("Features", DataKind.Single, 1, 29)
    },
    Separators = new[] { ';' },
};
var loader = ctx.Data.CreateTextLoader(opts);
var data = loader.Load(@"C:\test.csv");
var trainValidationData = ctx.Data.TrainTestSplit(data, testFraction: 0.2);
var pipeline = ctx.Auto()
    .Featurizer(data)
    .Append(ctx.Transforms.Conversion.MapValueToKey("Label"))
    .Append(ctx.Auto().MultiClassification());
var xx = ctx.Auto()
    .CreateExperiment()
    .SetPipeline(pipeline)
    .SetMulticlassClassificationMetric(MulticlassClassificationMetric.MacroAccuracy)
    .SetTrainingTimeInSeconds(60)
    .SetDataset(trainValidationData)
    .Run();
Removing the Featurizer step does not change the result; the same error occurs with this pipeline:
var pipeline = ctx.Transforms.Conversion.MapValueToKey("Label")
    .Append(ctx.Auto().MultiClassification());
Generates error:
System.AggregateException : One or more errors occurred. (label column 'Label' not found (Parameter 'schema'))
----> System.ArgumentOutOfRangeException : label column 'Label' not found (Parameter 'schema')
Data:
ML_IsMarked: 1
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at Microsoft.ML.AutoML.AutoMLExperiment.Run()
The loaded data looks as expected.
Expected behavior
AutoML should find the Label column in the schema, since it has been specified explicitly in the loader options.
This might be caused by the missing header row and/or by not using InferColumns. The schema looks fine when inspected manually at runtime. Am I missing something?
I am running into a similar problem. In my case, the experiment uses Binary Classification.
It seems that whatever data view the evaluator sees does not have the Label column.
System.ArgumentOutOfRangeException: label column 'Label' not found (Parameter 'schema')
at Microsoft.ML.Data.RoleMappedSchema.MapFromNames(DataViewSchema schema, IEnumerable`1 roles, Boolean opt)
at Microsoft.ML.Data.RoleMappedSchema..ctor(DataViewSchema schema, IEnumerable`1 roles, Boolean opt)
at Microsoft.ML.Data.RoleMappedData..ctor(IDataView data, Boolean opt, KeyValuePair`2[] roles)
at Microsoft.ML.Data.BinaryClassifierEvaluator.Evaluate(IDataView data, String label, String score, String predictedLabel)
I dug deeper into the AutoML code and found that the label column name used by the evaluator is always 'label' (lowercase).
I renamed "Label" to "label" everywhere, and that fixed the issue.
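For anyone hitting the same error, the rename described above might look roughly like the sketch below when applied to the original repro. It is not a confirmed fix from the ML.NET team. `LabelName` is a hypothetical constant introduced only for illustration, and passing `labelColumnName` to `MultiClassification` is my assumption about what "everywhere" has to include.

```csharp
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

// Hypothetical constant: lowercase name that the evaluator reportedly looks for.
const string LabelName = "label";

var ctx = new MLContext(1);
var loader = ctx.Data.CreateTextLoader(new TextLoader.Options
{
    HasHeader = false,
    Separators = new[] { ';' },
    Columns = new[]
    {
        // Same column layout as the original report, only the label name changes.
        new TextLoader.Column(LabelName, DataKind.UInt32, 0),
        new TextLoader.Column("Features", DataKind.Single, 1, 29)
    }
});
var data = loader.Load(@"C:\test.csv");
var split = ctx.Data.TrainTestSplit(data, testFraction: 0.2);

// Assumption: the trainer must also be told about the renamed label column.
var pipeline = ctx.Transforms.Conversion.MapValueToKey(LabelName)
    .Append(ctx.Auto().MultiClassification(labelColumnName: LabelName));

var result = ctx.Auto()
    .CreateExperiment()
    .SetPipeline(pipeline)
    .SetMulticlassClassificationMetric(MulticlassClassificationMetric.MacroAccuracy)
    .SetTrainingTimeInSeconds(60)
    .SetDataset(split)
    .Run();
```

If the installed version exposes an optional `labelColumn` argument on `SetMulticlassClassificationMetric` and `SetBinaryClassificationMetric` (I believe it does, but please check against your package version), passing `labelColumn: "Label"` there may avoid the rename altogether.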