Feature request - Conformal Prediction
Problem: XGBoost is a great library, but it currently lacks reliable, modern uncertainty quantification, which is relatively easy to add using conformal prediction. https://github.com/valeman/awesome-conformal-prediction
Feature request - add conformal prediction for regression and classification, similar to what other libraries such as MAPIE have implemented.
Regression - Inductive (split) conformal prediction and Conformalized Quantile Regression https://mapie.readthedocs.io/en/stable/examples_regression/4-tutorials/plot_cqr_tutorial.html
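For illustration, here is a minimal sketch of the inductive (split) conformal approach on top of a plain XGBoost regressor. The synthetic data, split sizes, and coverage level are placeholders for the example, not a proposed API.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Placeholder synthetic regression data; alpha is the target miscoverage level.
X, y = np.random.rand(1000, 5), np.random.rand(1000)
alpha = 0.1

# Split into a proper training set and a held-out calibration set.
X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.25, random_state=0)

model = xgb.XGBRegressor(n_estimators=200)
model.fit(X_train, y_train)

# Conformity scores: absolute residuals on the calibration set.
scores = np.abs(y_cal - model.predict(X_cal))

# Finite-sample-corrected empirical quantile of the scores.
n = len(scores)
q_level = np.ceil((n + 1) * (1 - alpha)) / n
q_hat = np.quantile(scores, q_level, method="higher")

# Prediction intervals with marginal coverage >= 1 - alpha.
y_pred = model.predict(X)
lower, upper = y_pred - q_hat, y_pred + q_hat
```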
Classification - Venn Abers
https://proceedings.neurips.cc/paper/2015/file/a9a1d5317a33ae8cef33961c34144f84-Paper.pdf
There is also a talk by Vovk: https://m.youtube.com/watch?v=ksrUJdb2tA8&pp=ygUNVm92ayB2bGFkaW1pcg%3D%3D
Tutorial https://cml.rhul.ac.uk/copa2017/presentations/VennTutorialCOPA2017.pdf
https://github.com/ip200/venn-abers implementation by Ivan Petej
Older implementation by Paolo Toccaceli
https://github.com/ptocca/VennABERS
Venn ABERS demo https://github.com/ptocca/VennABERS-demo
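For the classification side, here is a deliberately naive sketch of an inductive Venn-ABERS predictor built from two isotonic regressions, following the Vovk & Petej formulation. It refits the calibrator per test point (the linked implementations use a much faster algorithm), and it assumes an already-fitted binary XGBoost classifier and a held-out calibration set; data and names are placeholders.

```python
import numpy as np
import xgboost as xgb
from sklearn.isotonic import IsotonicRegression

# Placeholder binary classification data split into train / calibration / test.
rng = np.random.default_rng(0)
X = rng.normal(size=(1500, 5))
y = (X[:, 0] + rng.normal(size=1500) > 0).astype(int)
X_train, y_train = X[:500], y[:500]
X_cal, y_cal = X[500:1000], y[500:1000]
X_test = X[1000:]

clf = xgb.XGBClassifier(n_estimators=200)
clf.fit(X_train, y_train)

# Underlying scores (predicted probability of class 1) on calibration and test sets.
s_cal = clf.predict_proba(X_cal)[:, 1]
s_test = clf.predict_proba(X_test)[:, 1]

def venn_abers(s_cal, y_cal, s):
    """Naive inductive Venn-ABERS: fit isotonic regression twice per test score,
    once with the test point labelled 0 and once labelled 1."""
    p = []
    for label in (0, 1):
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(np.append(s_cal, s), np.append(y_cal, label))
        p.append(iso.predict([s])[0])
    p0, p1 = p
    # [p0, p1] is the multiprobability prediction; a common single estimate follows.
    return p0, p1, p1 / (1.0 - p0 + p1)

p0, p1, p_single = venn_abers(s_cal, y_cal, s_test[0])
```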
Thank you for raising the issue. Yes, I have been looking into it and can try to build one on top of quantile regression. Another direction is based on predicting distribution parameters, which is also possible.
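As a rough sketch of the quantile-regression direction, conformalized quantile regression (Romano et al., 2019) could sit on top of XGBoost's quantile loss (`reg:quantileerror`, available since XGBoost 2.0). The data, split, and hyperparameters below are assumptions for illustration, not a proposed design.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Placeholder regression data; target coverage is 1 - alpha.
X, y = np.random.rand(2000, 5), np.random.rand(2000)
alpha = 0.1
X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)

# Fit lower and upper quantile models with XGBoost's quantile objective.
lo = xgb.XGBRegressor(objective="reg:quantileerror", quantile_alpha=alpha / 2)
hi = xgb.XGBRegressor(objective="reg:quantileerror", quantile_alpha=1 - alpha / 2)
lo.fit(X_train, y_train)
hi.fit(X_train, y_train)

# CQR conformity scores on the calibration set.
scores = np.maximum(lo.predict(X_cal) - y_cal, y_cal - hi.predict(X_cal))
n = len(scores)
q_hat = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Conformalized interval: widen the raw quantile band by q_hat.
lower = lo.predict(X) - q_hat
upper = hi.predict(X) + q_hat
```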
The question is: since it's a post-hoc method, do we need to build it inside XGBoost, or is it better to build an independent library that works across all types of models, including the ones from XGBoost?
@trivialfis there is a big demand for having this inside XGBoost.
@valeman
> there is a big demand for having this inside XGBoost
What would be the main benefit of having the feature inside XGBoost? Do you see any of the following benefits from having conformal prediction built inside XGBoost (compared to the alternative of implementing it as an external library)?
- Ease of use
- Faster performance, i.e. predictions produced in less time.
- Smaller memory footprint
- Enable additional use cases that would not be possible if conformal prediction were provided by an external library.
Hi @hcho3,
- XGBoost would benefit from having a reliable uncertainty quantification framework that provides guarantees of calibration.
- Conformal prediction is easy to implement.
- Having conformal prediction inside XGBoost will make it easy for users to produce such models, including in production pipelines.
- Faster performance.
Happy to discuss this in more detail. https://www.linkedin.com/in/valeriy-manokhin-phd-mba-cqf-704731236/
We implemented a version of this ourselves at RedHat. It would be a good addition to the library.
Considering that we need a calibrator for conformal prediction, and that this becomes nontrivial once we start working on distributed and GPU versions, I think it's best to host it in a different project. Also, I have seen the same feature request opened on other projects as well; implementing it independently of XGBoost can be more useful than having it exclusive to XGBoost and then doing it over again for other projects.
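To illustrate the "independent library" option: a post-hoc conformal wrapper needs nothing XGBoost-specific. A sketch like the following (the class and method names are hypothetical) would work with any estimator exposing `fit`/`predict`, including XGBoost models, which is the main argument for hosting it outside the core library.

```python
import numpy as np

class SplitConformalRegressor:
    """Hypothetical model-agnostic wrapper: split conformal intervals around
    any regressor that exposes fit(X, y) and predict(X)."""

    def __init__(self, estimator, alpha=0.1):
        self.estimator = estimator
        self.alpha = alpha

    def fit(self, X_train, y_train, X_cal, y_cal):
        # Train the underlying model, then calibrate on held-out data.
        self.estimator.fit(X_train, y_train)
        scores = np.abs(y_cal - self.estimator.predict(X_cal))
        n = len(scores)
        level = np.ceil((n + 1) * (1 - self.alpha)) / n
        self.q_hat_ = np.quantile(scores, level, method="higher")
        return self

    def predict_interval(self, X):
        # Symmetric interval around the point prediction.
        pred = self.estimator.predict(X)
        return pred - self.q_hat_, pred + self.q_hat_

# Works with XGBoost, scikit-learn, LightGBM, ... without touching their internals:
# cp = SplitConformalRegressor(xgb.XGBRegressor()).fit(X_train, y_train, X_cal, y_cal)
```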
cc @jameslamb for awareness.