
Question: Catboost Quantile Regression

Open m1sta opened this issue 4 years ago • 1 comments

Would you be able to shed some light on when to use the distribution fitting features of CatBoostLSS versus the quantile regression feature of the standard CatBoost package?

m1sta avatar Jan 26 '21 22:01 m1sta

@m1sta Thanks for the question.

Choosing between the two depends on the problem at hand I'd say:

  • The quantile regression feature of CatBoost lets you model different parts of the conditional distribution as a function of covariates. It therefore lets you, e.g., model and analyse the relationship between extreme quantiles, say 5% and 95%. As far as I know, however, there is no way of preventing quantile crossing: for closely spaced quantiles, say 90% and 92%, the 92% quantile can be estimated to be below the 90% quantile.
  • CatBoostLSS is a parametric approach and lets you analyse the relationship between covariates and all distributional parameters, e.g., mean and variance, and hence also better understand heteroscedasticity. It provides a richer interpretation and analysis of the conditional distribution. Since you can draw samples from the fitted distribution, you can also derive different quantiles from it. However, in contrast to quantile regression, you cannot directly relate individual quantiles to covariates.
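
To make the contrast a bit more concrete, here is a minimal sketch in plain Python. It does not call CatBoost or CatBoostLSS; the helper names (`pinball_loss`, `crossing_rows`, `parametric_quantiles`) and all numbers are made up for illustration, and the Normal distribution is just an assumed example of a parametric family. It shows the loss that quantile regression minimises, how crossing between two separately estimated quantiles could be detected, and why quantiles read off a single fitted distribution are ordered by construction.

```python
from statistics import NormalDist


def pinball_loss(y_true, y_pred, alpha):
    """Average quantile (pinball) loss for quantile level alpha."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
        total += alpha * diff if diff >= 0 else (alpha - 1.0) * diff
    return total / len(y_true)


def crossing_rows(lower_preds, upper_preds):
    """Indices where the higher quantile is predicted below the lower one."""
    return [
        i
        for i, (lo, hi) in enumerate(zip(lower_preds, upper_preds))
        if hi < lo
    ]


def parametric_quantiles(mean, std, alphas):
    """Quantiles of an assumed Normal(mean, std); monotone in alpha."""
    dist = NormalDist(mu=mean, sigma=std)
    return [dist.inv_cdf(a) for a in alphas]


# Hypothetical predictions from two independently fitted quantile models.
q90 = [10.0, 12.5, 9.0]
q92 = [10.4, 12.1, 9.3]  # the second row lies below q90: a crossing
crossed = crossing_rows(q90, q92)

# Hypothetical distributional parameters predicted for one observation.
qs = parametric_quantiles(mean=10.0, std=2.0, alphas=[0.05, 0.90, 0.92, 0.95])

print("rows with quantile crossing:", crossed)
print("parametric quantiles:", qs)
```

Because each quantile model is fitted on its own, nothing ties the 90% and 92% predictions together, whereas every quantile of the parametric fit comes from the same inverse CDF and so cannot cross.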

So in summary, both approaches allow you to go beyond modelling the conditional mean, but the model of choice depends on the problem at hand.

Hope that helps. Feel free to come back in case of further questions.

StatMixedML avatar Jan 31 '21 08:01 StatMixedML