machinelearning-samples
I sometimes get this with the regression example: System.InvalidOperationException: Cannot hold covariance matrix in memory
Any idea what could be causing this? My training data is not very big: about 40K rows.
Running AutoML regression experiment for 60 seconds...
| Trainer RSquared Absolute-loss Squared-loss RMS-loss Duration |
|1 LightGbmRegression 0.9930 140404.81 42178224023.09 205373.38 9.2 |
|2 FastTreeRegression 0.9934 136725.85 39940180886.08 199850.40 8.8 |
|3 FastTreeTweedieRegression 0.9913 142834.10 52816241323.19 229817.84 9.2 |
|4 FastForestRegression 0.9091 538589.21 551116053939.22 742371.91 9.4 |
Exception during AutoML iteration: System.InvalidOperationException: Cannot hold covariance matrix in memory with 94459 features
at Microsoft.ML.Trainers.OlsTrainer.TrainCore(IChannel ch, Factory cursorFactory, Int32 featureCount)
at Microsoft.ML.Trainers.OlsTrainer.TrainModelCore(TrainContext context)
at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
at Microsoft.ML.Trainers.TrainerEstimatorBase`2.Fit(IDataView input)
at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String groupId, String labelColumn, IMetricsAgent`1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, IChannel logger)
|6 LightGbmRegression 0.9295 447071.45 427662770504.29 653959.30 9.4 |
|7 FastTreeRegression 0.2663 1336968.13 4450480773603.34 2109616.26 7.0 |
|8 FastTreeTweedieRegression 0.9923 133275.19 46493327388.81 215623.11 12.2 |
Top models ranked by R-Squared --
| Trainer RSquared Absolute-loss Squared-loss RMS-loss Duration |
|1 FastTreeRegression 0.9934 136725.85 39940180886.08 199850.40 8.8 |
|2 LightGbmRegression 0.9930 140404.81 42178224023.09 205373.38 9.2 |
|3 FastTreeTweedieRegression 0.9923 133275.19 46493327388.81 215623.11 12.2 |
Hi @tkdogan
Sorry you ran into this. Which example do you get the error on?
The ordinary least squares (OLS) regression trainer is memory bound: it allocates O(N^2) memory for the covariance matrix, where N is the number of features (94459 in your run, after featurization): https://github.com/dotnet/machinelearning/blob/712c3ec0745f45b93e394f8e333deaa5da4f2737/src/Microsoft.ML.Mkl.Components/OlsLinearRegression.cs#L180-L182
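To give a rough sense of why 94459 features is too many even with only 40K rows, here is a back-of-the-envelope sketch in Python (not ML.NET code). It assumes the trainer stores only one triangle of the symmetric covariance matrix, adds one extra column for the bias term, and needs the element count to fit in a 32-bit signed integer. Those three details are my assumptions about the linked check, and the function names are made up for illustration.

```python
# Rough arithmetic sketch, not ML.NET code.
# Assumptions (not confirmed in this thread): packed triangular storage,
# one extra column for the bias, and a 32-bit signed limit on element count.

INT32_MAX = 2**31 - 1


def packed_covariance_entries(feature_count: int) -> int:
    """Number of entries in one triangle (including the diagonal) of an
    (N + 1) x (N + 1) symmetric matrix, where N is the feature count."""
    m = feature_count + 1  # assumed: +1 for the bias term
    return m * (m + 1) // 2


def fits_in_int32_index(feature_count: int) -> bool:
    """True if the packed matrix could be indexed with a 32-bit signed int."""
    return packed_covariance_entries(feature_count) <= INT32_MAX


if __name__ == "__main__":
    n = 94459  # feature count reported in the exception message
    entries = packed_covariance_entries(n)
    print(f"features: {n}")
    print(f"packed entries: {entries}")
    print(f"approx. size at 8 bytes per entry: {entries * 8 / 1e9:.1f} GB")
    print(f"fits in 32-bit index: {fits_in_int32_index(n)}")
```

Under these assumptions the cost depends only on the number of feature columns, not on the number of rows, which is why a modest dataset can still trip the check once featurization (for example one-hot encoding or text n-grams) inflates the column count.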
I wouldn't be concerned: the AutoML code ignores the failure of that model and continues optimizing the remaining trainers.
If you're using the AutoML API, you can add a feature selection step as a pre-featurizer -- https://github.com/dotnet/machinelearning-samples/blob/5831bdd9bea8e42e1d3e4967486653a5df1abe4c/samples/csharp/getting-started/AdvancedExperiment_AutoML/README.md#step-3-add-a-pre-featurizer
I ended up reducing dimensionality and the problem went away. Thanks for explaining this and good to know that AutoML will ignore those failures.