OneDAL FastForest training has an "Array dimensions exceeded supported range" exception

Open 80LevelElf opened this issue 1 year ago • 1 comments

System Information (please complete the following information):

  • OS & Version: Linux Alpine
  • ML.NET Version: ML.NET 3.0
  • .NET Version: .NET 5.0

Describe the bug

Array dimensions exceeded supported range.
   at System.Collections.Generic.List`1.set_Capacity(Int32 value)
   at System.Collections.Generic.List`1.AddWithResize(T item)
   at Microsoft.ML.OneDal.OneDalUtils.GetTrainData(IChannel channel, Factory cursorFactory, List`1& featuresList, List`1& labelsList, Int32 numberOfFeatures)
   at Microsoft.ML.Trainers.FastTree.FastForestBinaryTrainer.TrainCoreOneDal(IChannel ch, Factory cursorFactory, Int32 featureCount)
   at Microsoft.ML.Trainers.FastTree.FastForestBinaryTrainer.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.AutoML.BinaryClassificationRunner.Run(TrialSettings settings)
   at Microsoft.ML.AutoML.BinaryClassificationRunner.RunAsync(TrialSettings settings, CancellationToken ct)
   at Microsoft.ML.AutoML.AutoMLExperiment.RunAsync(CancellationToken ct)

We have seen this error many times in our internal ML.NET logs. It does not appear to be related to training set size (in the case I copied this error from, we had only 20,000 training rows).

80LevelElf, Dec 22 '23 15:12

This still happens, in my case with the regression trainer (FastForestRegressionTrainer):

System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at Microsoft.ML.Trainers.FastTree.Dataset.MapFeatureToFlockAndSubFeature(Int32 feature, Int32& flock, Int32& subfeature)
   at Microsoft.ML.Trainers.FastTree.InternalRegressionTree.PopulateThresholds(Dataset dataset)
   at Microsoft.ML.Trainers.FastTree.FastForestRegressionTrainer.TrainCoreOneDal(IChannel ch, Factory cursorFactory, Int32 featureCount)
   at Microsoft.ML.Trainers.FastTree.FastForestRegressionTrainer.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)

Package version: Microsoft.ML.OneDal 0.22.0-preview.24271.1. Is there any benchmark showing that oneDAL with ML.NET is actually faster (when it works)?

superichmann, Jul 01 '24 12:07