Unable to remove SdcaLogisticRegressionOva from AutoML Multiclassification Experiment
System Information (please complete the following information):
- OS & Version: Windows 11
- ML.NET Version: v3.0.1 & AutoML 0.21.1
- .NET Version: 8.0
Describe the bug
When creating an AutoML multiclass classification experiment, the trainer "SdcaLogisticRegressionOva" cannot be removed from the trainers that AutoML runs.
To Reproduce
Steps to reproduce the behavior:
- Create a Multiclass experiment settings object
- Iterate over settings.Trainers and remove all trainers that are not "LightGbm" or "FastForest".
- Create a Multiclass Progress Reporter that outputs the TrainerName used.
- Use this Replace chain to clean up the TrainerName value, which is currently bugged in 3.0.1 and 0.21.1:
TrainerName.Replace("Multi", "").Replace("ReplaceMissingValues", "").Replace("Concatenate", "").Replace("Unknown", "").Replace("=>", "");
- Run experiment and monitor names.
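The Replace chain from the steps above can be wrapped in a small helper so the reporter prints only the trainer name. This is a minimal sketch: `TrainerNameCleaner`, `Clean`, and the sample input string are hypothetical, and the exact format of the bugged TrainerName value is assumed, not confirmed.

```csharp
using System;

public static class TrainerNameCleaner
{
    // Fragments that the issue strips from the reported TrainerName value.
    private static readonly string[] Noise =
        { "Multi", "ReplaceMissingValues", "Concatenate", "Unknown", "=>" };

    // Applies the same removals, in the same order, as the Replace chain above.
    public static string Clean(string trainerName)
    {
        foreach (var fragment in Noise)
        {
            trainerName = trainerName.Replace(fragment, "");
        }
        return trainerName;
    }

    public static void Main()
    {
        // Hypothetical raw value; the real format may differ.
        Console.WriteLine(Clean("ReplaceMissingValues=>Concatenate=>LightGbmMulti"));
        // prints "LightGbm"
    }
}
```

A name such as "SdcaLogisticRegressionOva" contains none of the stripped fragments, so it passes through unchanged and remains visible in the reporter output.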
Expected behavior
Only the trainers left in settings.Trainers are used. Instead, one of the first three models includes the unremovable trainer.
Screenshots, Code, Sample Projects
MulticlassExperimentSettings settings = new MulticlassExperimentSettings()
{
    OptimizingMetric = optimizeMetric,
    MaxExperimentTimeInSeconds = experimentTime,
    CacheDirectoryName = cacheDir,
    CancellationToken = cts.Token,
    CacheBeforeTrainer = CacheBeforeTrainer.On
};
bool keptLightGBM = false;
// Iterate over a copy so trainers can be removed from the live collection.
foreach (var trainer in settings.Trainers.ToList())
{
    // Keep only trainers whose name contains "LightGbm" or "FastForest".
    if (!trainer.ToString().ToUpperInvariant().Contains("LIGHTGBM") && !trainer.ToString().ToUpperInvariant().Contains("FASTFOREST"))
    {
        settings.Trainers.Remove(trainer);
        Console.WriteLine("Removed Trainer: " + trainer.ToString());
    }
    //else
    //{
    //    if (keptLightGBM)
    //    {
    //        settings.Trainers.Remove(trainer);
    //        Console.WriteLine("Removed Extra \"LightGbm\" Trainer: " + trainer.ToString());
    //    }
    //    else
    //        keptLightGBM = true;
    //}
}
MulticlassClassificationExperiment experiment = context.Auto().CreateMulticlassClassificationExperiment(settings);
ExperimentResult<MulticlassClassificationMetrics> result;
result = experiment.Execute(trainData, splitTestData, columnInformation, null, new MulticlassProgressReporter() { labelColumnName = label, CacheDir = cacheDir, ExperimentTime = DateTime.Now });
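The filter above matches trainers by substring on their names. An alternative that compares values directly is sketched here. `TrainerFilter`, `KeepOnly`, and the local `DemoTrainer` enum are hypothetical stand-ins so the example runs without ML.NET, and the sketch assumes settings.Trainers is a mutable ICollection of MulticlassClassificationTrainer values. It would not by itself explain why "SdcaLogisticRegressionOva" still runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Microsoft.ML.AutoML.MulticlassClassificationTrainer.
public enum DemoTrainer { FastForestOva, FastTreeOva, LightGbm, LbfgsMaximumEntropy }

public static class TrainerFilter
{
    // Removes every item that is not in 'keep' and returns what was removed.
    public static List<T> KeepOnly<T>(ICollection<T> trainers, params T[] keep)
    {
        var allowed = new HashSet<T>(keep);
        // Snapshot first: removing while enumerating the live collection throws.
        var removed = trainers.Where(t => !allowed.Contains(t)).ToList();
        foreach (var trainer in removed)
        {
            trainers.Remove(trainer);
        }
        return removed;
    }

    public static void Main()
    {
        var trainers = new List<DemoTrainer>
        {
            DemoTrainer.FastForestOva, DemoTrainer.FastTreeOva,
            DemoTrainer.LightGbm, DemoTrainer.LbfgsMaximumEntropy
        };
        var removed = KeepOnly(trainers, DemoTrainer.LightGbm, DemoTrainer.FastForestOva);
        Console.WriteLine("Kept: " + string.Join(", ", trainers));
        Console.WriteLine("Removed: " + string.Join(", ", removed));
    }
}
```

Under the assumption above, the equivalent call against the real settings object would be `TrainerFilter.KeepOnly(settings.Trainers, MulticlassClassificationTrainer.LightGbm, MulticlassClassificationTrainer.FastForestOva)`.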
This code produces this output:
Additional context
If LightGbm is left as the only trainer, AutoML still uses "SdcaLogisticRegressionOva" on every other run.
The trainer "SdcaLogisticRegressionOva" does not appear in settings.Trainers after the settings object is created, even though that list is supposed to be populated with every available trainer. Also, iterating over the auto-populated list shows two items named "LightGbm".
Lastly, when I peek at the definition of Microsoft.ML.AutoML.MulticlassClassificationTrainer, the enum also does not contain "SdcaLogisticRegressionOva":
// Decompiled with JetBrains decompiler
// Type: Microsoft.ML.AutoML.MulticlassClassificationTrainer
// Assembly: Microsoft.ML.AutoML, Version=1.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51
// MVID: 5D7A79B7-CF20-433B-A534-1ED92C335230
// Assembly location: C:\Users\xxxx\.nuget\packages\microsoft.ml.automl\0.21.1\lib\netstandard2.0\Microsoft.ML.AutoML.dll
// XML documentation location: C:\Users\xxxx\.nuget\packages\microsoft.ml.automl\0.21.1\lib\netstandard2.0\Microsoft.ML.AutoML.xml
#nullable disable
namespace Microsoft.ML.AutoML
{
    /// <summary>
    /// Enumeration of ML.NET multiclass classification trainers used by AutoML.
    /// </summary>
    public enum MulticlassClassificationTrainer
    {
        /// <summary>
        /// <see cref="T:Microsoft.ML.Trainers.OneVersusAllTrainer" /> using <see cref="T:Microsoft.ML.Trainers.FastTree.FastForestBinaryTrainer" />.
        /// </summary>
        FastForestOva,
        /// <summary>
        /// <see cref="T:Microsoft.ML.Trainers.OneVersusAllTrainer" /> using <see cref="T:Microsoft.ML.Trainers.FastTree.FastTreeBinaryTrainer" />.
        /// </summary>
        FastTreeOva,
        /// <summary>
        /// See <see cref="T:Microsoft.ML.Trainers.LightGbm.LightGbmMulticlassTrainer" />.
        /// </summary>
        LightGbm,
        /// <summary>
        /// See <see cref="T:Microsoft.ML.Trainers.LbfgsMaximumEntropyMulticlassTrainer" />.
        /// </summary>
        LbfgsMaximumEntropy,
        /// <summary>
        /// <see cref="T:Microsoft.ML.Trainers.OneVersusAllTrainer" /> using <see cref="T:Microsoft.ML.Trainers.LbfgsLogisticRegressionBinaryTrainer" />.
        /// </summary>
        LbfgsLogisticRegressionOva,
        /// <summary>
        /// See <see cref="T:Microsoft.ML.Trainers.SdcaMaximumEntropyMulticlassTrainer" />.
        /// </summary>
        SdcaMaximumEntropy,
    }
}
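The two observations above (a duplicate "LightGbm" entry in the populated list, and no "SdcaLogisticRegressionOva" member in the enum) can also be checked programmatically. A rough sketch follows. `TrainerDiagnostics`, `IsDeclared`, `Duplicates`, and `DecompiledTrainer` are hypothetical; `DecompiledTrainer` copies the member names from the decompiled enum so the example runs without ML.NET.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical copy of the member names in the decompiled enum above.
public enum DecompiledTrainer
{
    FastForestOva, FastTreeOva, LightGbm,
    LbfgsMaximumEntropy, LbfgsLogisticRegressionOva, SdcaMaximumEntropy
}

public static class TrainerDiagnostics
{
    // True if the enum type declares a member with exactly this name.
    public static bool IsDeclared<T>(string name) where T : struct, Enum
        => Enum.GetNames(typeof(T)).Contains(name);

    // Returns each value that occurs more than once, with its count.
    public static Dictionary<T, int> Duplicates<T>(IEnumerable<T> items) where T : struct
        => items.GroupBy(x => x)
                .Where(g => g.Count() > 1)
                .ToDictionary(g => g.Key, g => g.Count());

    public static void Main()
    {
        Console.WriteLine(IsDeclared<DecompiledTrainer>("SdcaLogisticRegressionOva"));

        // Hypothetical list contents mimicking the duplicate described in the issue.
        var listed = new[]
        {
            DecompiledTrainer.LightGbm, DecompiledTrainer.FastForestOva, DecompiledTrainer.LightGbm
        };
        foreach (var pair in Duplicates(listed))
        {
            Console.WriteLine(pair.Key + " x" + pair.Value);
        }
    }
}
```

Against the real package, the same helpers could be pointed at MulticlassClassificationTrainer and settings.Trainers to capture both symptoms in a single log line for the report.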