NimbusML
Expose Ensembling from ML.NET
what title says
PR #207 exposed `EnsembleClassifier` (multiclass) and `EnsembleRegressor`, along with the components needed to sample subsets of the data on which to train each model in the ensemble, and the components that select a subset of the trained models and combine their output to form the ensemble.
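The sample / train / select / combine flow described above can be sketched in plain Python. This is only an illustration of the idea, not the ML.NET or NimbusML implementation: the toy centroid learner, the accuracy-based selector, the majority-vote combiner, and every function name here are assumptions made for the example.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Sample len(rows) rows with replacement (one possible subset sampler)."""
    return [rng.choice(rows) for _ in rows]


def train_centroid_model(rows):
    """Toy 1-D learner: predicts the label whose mean feature value is closest."""
    sums = {}
    for x, label in rows:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    centroids = {label: total / count for label, (total, count) in sums.items()}

    def predict(x):
        return min(centroids, key=lambda label: abs(x - centroids[label]))

    return predict


def accuracy(model, rows):
    """Fraction of rows the model labels correctly."""
    return sum(model(x) == label for x, label in rows) / len(rows)


def train_ensemble(rows, num_models=5, keep=3, seed=0):
    """Train num_models learners on bootstrap samples, then keep the best `keep`."""
    rng = random.Random(seed)
    models = [train_centroid_model(bootstrap_sample(rows, rng))
              for _ in range(num_models)]
    # Sub-model selection: rank by accuracy on the full training data.
    models.sort(key=lambda m: accuracy(m, rows), reverse=True)
    return models[:keep]


def predict_ensemble(models, x):
    """Output combiner: majority vote over the selected models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


data = [(0.1, "a"), (0.3, "a"), (0.2, "a"), (0.9, "b"), (1.1, "b"), (1.0, "b")]
ensemble = train_ensemble(data)
prediction = predict_ensemble(ensemble, 0.0)
```

Each of the three stages (sampler, selector, combiner) is a separate function here so that any one of them can be swapped out independently.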
Remaining work:
- Expose `BasePredictor` in ML.NET and then in NimbusML so that NimbusML users can specify one or more learners of their choice to use in the ensemble, instead of using the default `LogisticRegressionClassifier` for `EnsembleClassifier` and `OnlineGradientDescentRegressor` for `EnsembleRegressor`.
This would entail rewriting `EnsembleTrainerBase` and its derived classes in ML.NET as `IEstimator` instead of `ITrainer`, and also writing an `ITransformer` for them that would be produced by fitting the estimator.
- Expose `EnsembleBinaryClassifier` in NimbusML. Currently, binary classification can be done with `EnsembleClassifier`, but it would be useful to have a dedicated binary classifier so that users are not restricted to the multiclass classifiers in NimbusML for binary classification once `BasePredictor` is exposed.
The reason for not exposing the binary classifier is that NimbusML adds a `LabelColumnKeyBooleanConverter` to a `Pipeline`, which converts the label to Key, not Boolean. As `EnsembleTrainer` is currently implemented in ML.NET (i.e. as `ITrainer`), the label goes through type checks that require it to be Boolean. When implemented as `IEstimator`, as proposed in the first item above, the label would go through a different series of checks, which allow it to be Boolean or Key with a key count of two.
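
The difference between the two label checks described above can be modeled as follows. This is a simplified sketch of the behavior as stated in this issue, not ML.NET code: `LabelType`, `passes_trainer_check`, and `passes_estimator_check` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabelType:
    """Hypothetical stand-in for a label column type."""
    kind: str                       # "Boolean" or "Key"
    key_count: Optional[int] = None  # only meaningful for Key labels


def passes_trainer_check(label: LabelType) -> bool:
    # Models the stricter check described for the current ITrainer path:
    # only Boolean labels are accepted.
    return label.kind == "Boolean"


def passes_estimator_check(label: LabelType) -> bool:
    # Models the check described for the IEstimator path:
    # Boolean labels, or Key labels with exactly two key values.
    return label.kind == "Boolean" or (label.kind == "Key" and label.key_count == 2)


# A label converted to Key with two values, as the converter above produces
# for a binary problem.
converted_label = LabelType("Key", 2)
```

Under this model, `converted_label` is rejected by the first check and accepted by the second, which is why the binary classifier depends on the `IEstimator` rewrite.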