machinelearning
IndexOutOfRangeException at Microsoft.ML.Data.BufferBuilder`1.AddFeature
System Information (please complete the following information):
- OS & Version: Windows 11 [Version 10.0.22000.376]
- ML.NET Version: ML.NET v1.7.0
- .NET Version: .NET 6.0
- ML.NET Model Builder: ML.NET Model Builder 2022 v16.9.2.2205303
Describe the bug
I get IndexOutOfRangeException when calling PredictionEngine's Predict method. The exception is thrown in the AddFeature method of the BufferBuilder class.
at Microsoft.ML.Data.BufferBuilder`1.AddFeature(Int32 index, T value)
at Microsoft.ML.Transforms.NormalizeTransform.AffineColumnFunction.Sng.ImplVec.FillValues(VBuffer`1& input, BufferBuilder`1 bldr, Single[] scale)
at Microsoft.ML.Transforms.NormalizeTransform.AffineColumnFunction.Sng.ImplVec.<>c__DisplayClass5_0.<GetGetter>b__0(VBuffer`1& dst)
at Microsoft.ML.Data.TypedCursorable`1.TypedRowBase.<>c__DisplayClass8_0`1.<CreateDirectVBufferSetter>b__0(TRow row)
at Microsoft.ML.Data.TypedCursorable`1.TypedRowBase.FillValues(TRow row)
at Microsoft.ML.Data.TypedCursorable`1.RowImplementation.FillValues(TRow row)
at Microsoft.ML.PredictionEngineBase`2.FillValues(TDst prediction)
at Microsoft.ML.PredictionEngine`2.Predict(TSrc example, TDst& prediction)
at Microsoft.ML.PredictionEngineBase`2.Predict(TSrc example)
at BBD_SleepLogger.MLModel_IsAttached.Predict(ModelInput input) in C:\Work\BioBalanceDetector\Software\Source\BBDProto08\BBD.SleepLogger\MLModel_IsAttached.consumption.cs:line 30809
at BBD.SleepLogger.Program.EvaluateIndicators(ILogger logger, Int32 index, FftData inputData) in C:\Work\BioBalanceDetector\Software\Source\BBDProto08\BBD.SleepLogger\Program.cs:line 610
at BBD.SleepLogger.Program.<>c__DisplayClass26_2.<DataAcquisition_SamplesReceived>b__1() in C:\Work\BioBalanceDetector\Software\Source\BBDProto08\BBD.SleepLogger\Program.cs:line 467
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.<>c.<.cctor>b__272_0(Object obj)
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
To Reproduce
Steps to reproduce the behavior:
- Go to Solution Explorer, select your project
- Right click, Add, Machine Learning Project
- Set your data source and labels, then run the training to generate the code
- Consume the model by using the generated code: fill the ModelInput with values and call the MLModel.Predict(ModelInput) method
Expected behavior
I was expecting to have a ModelOutput object returned, but I got the above exception instead.
It would also be great to have a more detailed exception to help with bug hunting in case this is caused by a misconfiguration on the user's part.
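Until the library reports more detail, a caller-side wrapper can at least attach some context to the failure. The following is only a minimal sketch: `PredictDiagnostics` and `PredictWithContext` are hypothetical names, not part of ML.NET or of the code Model Builder generates, and the wrapper does not explain or fix the underlying exception.

```csharp
using System;

// Hypothetical helper: not part of ML.NET or of the Model Builder output.
public static class PredictDiagnostics
{
    // Runs the supplied prediction delegate and, if it throws
    // IndexOutOfRangeException, rethrows with the model name and input type
    // in the message. The original exception is preserved as InnerException.
    public static TOut PredictWithContext<TIn, TOut>(
        Func<TIn, TOut> predict, TIn input, string modelName)
    {
        if (predict == null) throw new ArgumentNullException(nameof(predict));

        try
        {
            return predict(input);
        }
        catch (IndexOutOfRangeException ex)
        {
            throw new InvalidOperationException(
                $"Prediction with model '{modelName}' failed for input type " +
                $"'{typeof(TIn).Name}'. See InnerException for the original stack trace.",
                ex);
        }
    }
}
```

With the generated class from the stack trace above, a call could look like `PredictDiagnostics.PredictWithContext(x => MLModel_IsAttached.Predict(x), input, "MLModel_IsAttached")`. Note that callers then see an InvalidOperationException instead of the original exception type.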
Screenshots, Code, Sample Projects

Additional context
Probably not critical information, but my data source is a ~360 MB CSV file with 5120 float feature columns and 6 label columns (5 of which are ignored for the training), and it has 5500 rows.
Update: I'm not 100% sure, but it looks like I get this exception when I use SdcaLogisticRegressionOva, LbfgsLogisticRegressionOva or LbfgsMaximumEntropyMulti, but it works fine with FastTreeOva, FastForestOva and LightGbmMulti.
@JakeRadMSFT do you think this is due to how you guys are building the pipeline itself? Or do we need to dig into the ML.NET code itself?
I used Visual Studio 2022 v17.0.4 with the ML.NET Model Builder 2022 v16.9.2.2205603 extension with my C# console application that includes these mbconfigs to generate the ML models: https://drive.google.com/file/d/1CZJitEjEd3GEhBbHZUMOgEbRg3GaPsMD/view?usp=sharing
Here is the data that I used for training and for the model accuracy measurements: https://drive.google.com/file/d/1C2Of2UIHN2y7l5J-SvEvacFhTAHMCCcm/view?usp=sharing
To get the exception, you need to stop the training when SdcaLogisticRegressionOva, LbfgsLogisticRegressionOva or LbfgsMaximumEntropyMulti has the best accuracy.
Hi guys, did you have the chance to look into this? Do you need any more data from me to reproduce the problem?
Until then, is there any workaround for this issue? Can I remove some of the training algorithms from the training loop of Model Builder? Or is there a way to manually choose the algorithm for the code/model generator in Model Builder after training?
@LittleLittleCloud didn't we see this issue somewhere else? What was the root cause?
@LittleLittleCloud can you also share with @andrasfuchs how to remove training algorithms from AutoML?
I'll take a look. Meanwhile, to disable trainers you can refer to this comment: https://github.com/dotnet/machinelearning-modelbuilder/issues/1998#issuecomment-1026240486