Please support XGBoost i.e. gradient boosting machines

Open AceHack opened this issue 7 years ago • 5 comments

This is the 2nd most popular model on Kaggle.

Thanks.

AceHack avatar Nov 06 '18 02:11 AceHack

@TomFinley what are your thoughts on this? We already have other tree based learning algorithms in ML.NET, but it is true that XGBoost is a very popular one.

artidoro avatar Nov 06 '18 04:11 artidoro

Hi @artidoro (and @AceHack ). We do wrap it internally, we just haven't ported over the wrapper. I'm not sure why; most probably simply lack of time and opportunity, since of course we've had many other things to do. I agree it would be a very nice thing to do.

One potential barrier is that the external distribution story will be somewhat difficult. Our current policy, which I think is good, is that to be included in the "official" ML.NET, a learner must work on Windows, Linux, and Mac. XGBoost runs on all those platforms, but I do not see a NuGet package covering all of them that we can easily consume, which means that we're either in the business of building it ourselves (as we did with LibMF and its subgit), or we somehow arrange for a NuGet package to be published. LightGBM is the closest analogy to the latter approach I can imagine, but the situation is somewhat different.

But on the whole it is not impossible. There are just a number of problems that need to be solved, and I strongly agree we should solve them.

The fact that there are multiple learners using the same basic technology (in this case trees) is not inherently problematic, and certainly not a reason to avoid shipping something as popular as XGBoost.

TomFinley avatar Nov 06 '18 06:11 TomFinley

The demand is there to port it to ML.NET, and for that reason it's useful to port.

That said, I don't see XGBoost winning very often vs. our existing tree based models. Perhaps a newer version will show gains, or perhaps a round of optimizing our default hyperparameters is in order.

justinormont avatar Nov 06 '18 07:11 justinormont

As an aside, XGBoost does have a lot of features, some of which we don't support in FastTree. It would also be great to understand which of those features we should support in FastTree.

rogancarr avatar Feb 07 '19 16:02 rogancarr

I think with XGBoost 2.0, it's ahead in a lot of metrics and would be highly useful.

JohnGalt1717 avatar Apr 09 '24 12:04 JohnGalt1717