[ENHANCEMENT] Multi-output regression
Following this paper: http://proceedings.mlr.press/v128/messoudi20a.html
We have a better calibrated version (using copulas to better adjust target-wise confidence levels): https://arxiv.org/abs/2101.12002. Code available here: https://github.com/M-Soundouss/CopulaConformalMTR
Interesting, thanks for sharing! We will definitely have a look when we start this development.
Hi @sdestercke, your work is definitely very interesting! What do you think about giving it more visibility by contributing to MAPIE?
Here are my suggestions:
- Create a new `multi_output_regression.py` module.
- Create a class `MapieMultiOutputRegressor` in it, with at least an `__init__`, a `fit` and a `predict` method.
- There should be at least two options in the `__init__`: `cv` and `method`. Following your papers, `cv="prefit"` is the only valid option for now (split-conformal), but we could imagine an extension to cross-validation in the future. `method` could be `"single"`, `"multi"` or `"copula_empirical"`, for example (reusing the notations of your two papers). Start by picking the simplest one.
- The output predictions returned by `predict` should be numpy arrays of indicative shape `(n_samples, 3, n_targets)`, the `3` standing for the prediction and the lower/upper bounds.
- Create a small unit test illustrating the use of your class on a minimal toy dataset in `mapie/tests/test_multi_output_regression.py`.
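A minimal sketch of the interface described in the list above, under stated assumptions: nothing here is existing MAPIE code or the method from the papers. The class body is hypothetical, `_LinearModel` is a toy stand-in for any prefit estimator, and the `"single"` option is interpreted here as a plain per-target absolute-residual quantile with a Bonferroni correction (not the copula approach).

```python
import numpy as np


class _LinearModel:
    """Toy least-squares model standing in for any prefit estimator (assumption)."""

    def fit(self, X, y):
        self.coef_, *_ = np.linalg.lstsq(X, y, rcond=None)
        return self

    def predict(self, X):
        return X @ self.coef_


class MapieMultiOutputRegressor:
    """Hypothetical sketch: split-conformal intervals for a prefit multi-output model."""

    def __init__(self, estimator, cv="prefit", method="single"):
        if cv != "prefit":
            raise ValueError("Only cv='prefit' is supported in this sketch.")
        if method != "single":
            raise ValueError("Only method='single' is supported in this sketch.")
        self.estimator = estimator
        self.cv = cv
        self.method = method

    def fit(self, X_calib, y_calib):
        # The estimator is already fitted; only calibration residuals are stored.
        y_calib = np.asarray(y_calib, dtype=float)
        self.residuals_ = np.abs(y_calib - self.estimator.predict(X_calib))
        return self

    def predict(self, X, alpha=0.1):
        y_pred = np.asarray(self.estimator.predict(X), dtype=float)
        n_calib, n_targets = self.residuals_.shape
        # Bonferroni correction (assumption): each target gets alpha / n_targets.
        alpha_t = alpha / n_targets
        # Rank of the finite-sample corrected quantile among sorted residuals.
        k = int(np.ceil((n_calib + 1) * (1 - alpha_t)))
        if k > n_calib:
            widths = np.full(n_targets, np.inf)  # too few calibration points
        else:
            widths = np.sort(self.residuals_, axis=0)[k - 1]
        # Stack prediction, lower and upper bound: (n_samples, 3, n_targets).
        return np.stack([y_pred, y_pred - widths, y_pred + widths], axis=1)


# Toy usage: train / calibration / test split on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 3))
W = rng.normal(size=(3, 2))
y = X @ W + 0.1 * rng.normal(size=(600, 2))

model = _LinearModel().fit(X[:200], y[:200])
mapie = MapieMultiOutputRegressor(model).fit(X[200:400], y[200:400])
y_pis = mapie.predict(X[400:], alpha=0.1)
print(y_pis.shape)  # (200, 3, 2)
```

The toy usage at the bottom is roughly what a unit test in `mapie/tests/test_multi_output_regression.py` could check: the output shape and the ordering of the lower and upper bounds.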
Beware that MAPIE, as its name indicates, is model-agnostic, so the code base should not be tied to deep learning libraries like tensorflow/pytorch (even though any user can still use tensorflow models with MAPIE).
You can use the code of MapieRegressor as a template.
If you have any doubts, I would be very pleased to answer your questions in this discussion!
What do you think?
Hi @gmartinonQM,
Many thanks for the suggestion and for the helpful tips on how to add our method to the MAPIE library.
I may not have the time to do it right now, but I think that @M-soundouss could be equally if not more interested, with maybe a bit more time on her hands right now! There are also a priori no big issues in making it model-agnostic. Two questions that come directly to mind:
- Basically, what we extend is the normalized version of conformal regression scores, so my guess is that the normalizing value should also be model-agnostic in the code?
- Would it be okay to use the copulae python library within the code?
Hi @sdestercke, to answer your questions:
- Yes, the normalizing value should be a user choice (but you can suggest a default model to use, preferably from sklearn).
- Yes, feel free to add the copulae dependency, at least in `setup.py` (`INSTALL_REQUIRES`) to ensure that the dependency is resolved for the python package, and in the dev environment `environment.dev.yml` for developers.
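To illustrate the first answer above, here is a rough sketch of target-wise normalized conformity scores where the normalizing model is supplied by the user. Everything in it is an assumption made for illustration: the function `normalized_scores`, the `beta` smoothing term and the trivial `MeanAbsResidual` default are hypothetical names, and any regressor exposing `fit`/`predict` (for example a scikit-learn one) could be passed instead.

```python
import numpy as np


class MeanAbsResidual:
    """Hypothetical default normalizer: one constant scale per target."""

    def fit(self, X, abs_residuals):
        self.scale_ = np.mean(abs_residuals, axis=0)
        return self

    def predict(self, X):
        # Same scale for every sample; a real normalizer would depend on X.
        return np.tile(self.scale_, (len(X), 1))


def normalized_scores(y_true, y_pred, X, normalizer, beta=1e-6):
    """Target-wise scores |y - y_hat| / (sigma_hat(x) + beta)."""
    # Clip so that a normalizer predicting negative scales cannot flip signs.
    sigma = np.clip(normalizer.predict(X), 0.0, None)
    return np.abs(y_true - y_pred) / (sigma + beta)


# Toy usage. In practice the normalizer would be fitted on data that is
# disjoint from the calibration set; the same data is reused here for brevity.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = rng.normal(size=(100, 3))
y_pred = np.zeros_like(y)

normalizer = MeanAbsResidual().fit(X, np.abs(y - y_pred))
scores = normalized_scores(y, y_pred, X, normalizer)
print(scores.shape)  # (100, 3)
```

Keeping the normalizer as a duck-typed argument means the conformal code never imports a specific modelling library, which matches the model-agnostic constraint discussed earlier in the thread.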
Any news about this, @sdestercke @M-Soundouss?
@gmartinonQM Not so far, @M-Soundouss is writing her thesis, which is time-consuming. Helping to implement multi-output regression for MAPIE is still on our to-do list, though.
Hi @sdestercke and @M-Soundouss, do you have any news about this issue?
Hello @vincentblot28! I started working on it, I'll keep you updated. Thank you!