
Support for Langevin Parameter in CatBoost Tuning

Open EmotionEngineer opened this issue 1 year ago • 2 comments

🚀 Feature Request

Motivation

I would like to introduce support for the `langevin=True` parameter in LightAutoML. This parameter enables Stochastic Gradient Langevin Boosting (SGLB), a powerful and efficient machine learning framework that handles a wide range of loss functions and provides provable generalization guarantees. The method is based on a special form of the Langevin diffusion equation designed specifically for gradient boosting, which makes it possible to theoretically guarantee global convergence even for multimodal loss functions, whereas standard gradient boosting algorithms can only guarantee a local optimum (Paper).

Proposal

I propose that LightAutoML support the `langevin=True` parameter during hyperparameter tuning of CatBoost models. This would let users leverage the benefits of SGLB when tuning CatBoost models with LightAutoML.
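To make the proposal concrete, here is a minimal sketch of what including `langevin` in a CatBoost hyperparameter search could look like. The helper name `sample_catboost_params` is illustrative only (not LightAutoML's actual tuner API); `langevin` and `diffusion_temperature` are real CatBoost parameters, and the latter is only meaningful when SGLB is enabled:

```python
# Illustrative sketch: drawing one CatBoost configuration that includes SGLB.
# `sample_catboost_params` is a hypothetical helper, not LightAutoML API.
import random


def sample_catboost_params(rng: random.Random) -> dict:
    """Draw one CatBoost hyperparameter configuration, optionally with SGLB."""
    params = {
        "learning_rate": rng.choice([0.03, 0.05, 0.1]),
        "depth": rng.randint(4, 8),
        "langevin": rng.choice([False, True]),  # toggle SGLB in the search space
    }
    if params["langevin"]:
        # diffusion_temperature (noise scale) only matters when langevin is on
        params["diffusion_temperature"] = rng.choice([1e3, 1e4, 1e5])
    return params


cfg = sample_catboost_params(random.Random(0))
```

Sampling `langevin` as a plain boolean keeps it cheap for the tuner: when it comes out `False`, the search space collapses back to the current behavior.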

Alternatives

As an alternative, users could manually set the `langevin` parameter when creating a CatBoost instance. However, this would be less convenient and efficient than having LightAutoML tune the parameter automatically.

Additional context

I have successfully used `langevin=True` in a Kaggle competition. That experience showed me the potential benefits of the parameter, and I believe it would be a useful addition to LightAutoML.

More details: #5 Solution

EmotionEngineer avatar Nov 10 '23 18:11 EmotionEngineer

@EmotionEngineer The LightAutoML solution has almost the same score as yours (proof link), but we'll take a look.

alexmryzhkov avatar Nov 13 '23 21:11 alexmryzhkov

@alexmryzhkov Hi! I agree with you, but it is still possible to get a marginal gain and it may be worth a try. Thank you!

EmotionEngineer avatar Nov 15 '23 11:11 EmotionEngineer