[ENHANCEMENT] Multi-output regression

Open vtaquet opened this issue 4 years ago • 9 comments

Following this paper: http://proceedings.mlr.press/v128/messoudi20a.html

vtaquet avatar Oct 07 '21 07:10 vtaquet

We have a better calibrated version (using copulas to better adjust target-wise confidence levels): https://arxiv.org/abs/2101.12002. Code available here: https://github.com/M-Soundouss/CopulaConformalMTR

sdestercke avatar Dec 17 '21 10:12 sdestercke

Interesting, thanks for sharing! We will definitely have a look when we start this development.

gmartinonQM avatar Dec 17 '21 10:12 gmartinonQM

Hi @sdestercke, your work is definitely very interesting! What do you think about giving it more visibility by contributing to MAPIE?

Here are my suggestions:

  • Create a new multi_output_regression.py module.
  • Create a class MapieMultiOutputRegressor in it, with at least __init__, fit and predict methods.
  • There should be at least two options in __init__: cv and method. Following your papers, cv="prefit" is the only valid option for now (split-conformal), but we could imagine an extension to cross-validation in the future. method could be "single", "multi" or "copula_empirical", for example (reusing the notation of your two papers). Start by picking the simplest one.
  • The predictions returned by predict should be numpy arrays of indicative shape (n_samples, 3, n_targets), the 3 standing for the prediction and its lower and upper bounds.
  • Create a small unit test illustrating the use of your class on a minimal toy dataset in mapie/tests/test_multi_output_regression.py.

Beware that MAPIE, as its name indicates, is model-agnostic, so the code base should not be tied to deep learning libraries like tensorflow/pytorch (even though any user can still use tensorflow models within MAPIE).

You can use the code of MapieRegressor as a template.
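To make the suggested interface more concrete, here is a rough sketch of what such a class could look like. It is only an illustration, not existing MAPIE code: the class body, the alpha argument of predict and the "bonferroni" method name are assumptions made for this example. The Bonferroni split (alpha divided by the number of targets) is a simple stand-in and is not one of the methods named in the papers.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


class MapieMultiOutputRegressor:
    """Hypothetical sketch of the proposed class; not part of MAPIE."""

    def __init__(self, estimator, cv="prefit", method="bonferroni"):
        self.estimator = estimator
        self.cv = cv
        self.method = method

    def fit(self, X_calib, y_calib):
        # cv="prefit": the estimator is assumed to be fitted already, so
        # fit only computes absolute residuals on the calibration set.
        if self.cv != "prefit":
            raise ValueError("Only cv='prefit' is handled in this sketch.")
        if self.method != "bonferroni":
            raise ValueError("Only the illustrative 'bonferroni' method is handled.")
        y_calib = np.asarray(y_calib)
        y_hat = self.estimator.predict(X_calib)
        self.residuals_ = np.abs(y_calib - y_hat)  # (n_calib, n_targets)
        return self

    def predict(self, X, alpha=0.1):
        n_calib, n_targets = self.residuals_.shape
        # Bonferroni correction: each target gets alpha / n_targets.
        alpha_target = alpha / n_targets
        # Rank of the split-conformal quantile with finite-sample correction.
        k = int(np.ceil((n_calib + 1) * (1 - alpha_target)))
        if k > n_calib:
            # Too few calibration points for this level: unbounded intervals.
            widths = np.full(n_targets, np.inf)
        else:
            widths = np.sort(self.residuals_, axis=0)[k - 1]
        y_pred = self.estimator.predict(X)  # (n_samples, n_targets)
        # Stack prediction, lower and upper bound: (n_samples, 3, n_targets).
        return np.stack([y_pred, y_pred - widths, y_pred + widths], axis=1)


# Toy usage on synthetic data with two targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
W = rng.normal(size=(4, 2))
Y = X @ W + 0.1 * rng.normal(size=(300, 2))

base = LinearRegression().fit(X[:150], Y[:150])
mapie = MapieMultiOutputRegressor(base, cv="prefit").fit(X[150:250], Y[150:250])
y_pis = mapie.predict(X[250:], alpha=0.1)
print(y_pis.shape)
```

The base estimator only needs a predict method returning an array of shape (n_samples, n_targets), which keeps the sketch model-agnostic.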

In case you have any doubts, I would be very pleased to answer your questions in this discussion!

What do you think?

gmartinonQM avatar Jan 17 '22 10:01 gmartinonQM

Hi @gmartinonQM,

Many thanks for the suggestion and for the helpful tips on how to add our method to the MAPIE library.

I may not have the time to do it right now, but I think that @M-soundouss could be equally if not more interested, and may have a bit more time on her hands at the moment! There are also, a priori, no big issues in making it model-agnostic. Two questions come directly to mind:

  • Basically, what we extend is the normalized version of conformal regression scores, so my guess is that the normalizing value should also be model-agnostic in the code?

  • Would it be okay to use the copulae python library within the code?

sdestercke avatar Jan 17 '22 14:01 sdestercke

Hi @sdestercke, to answer your questions:

  • yes, the normalizing value should be a user choice (but you can suggest a default model, preferably from sklearn)
  • yes, feel free to add the copulae dependency, at least in setup.py (INSTALL_REQUIRES) to ensure that the dependency is resolved for the python package, and in the dev environment file environment.dev.yml for developers
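As an illustration of the first point, the snippet below sketches normalized conformal scores where the normalizing model is chosen by the user. Everything here is an assumption made for the example: the helper name normalized_scores, the beta stabilizer, the single-target setting, and the choice of KNeighborsRegressor fitted on absolute training residuals. The exact normalization used in the papers may differ.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor


def normalized_scores(y, y_pred, sigma, beta=1e-6):
    """Return |y - y_pred| / (sigma + beta), element-wise (hypothetical helper)."""
    return np.abs(y - y_pred) / (np.clip(sigma, 0.0, None) + beta)


# Synthetic single-target data whose noise level depends on the first feature.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 2))
y = X @ np.array([1.0, -2.0]) + (0.1 + np.abs(X[:, 0])) * rng.normal(size=300)

X_train, y_train = X[:150], y[:150]
X_calib, y_calib = X[150:250], y[150:250]
X_test = X[250:]

# Main model (any regressor with fit/predict would do).
model = LinearRegression().fit(X_train, y_train)

# User-chosen normalizing model: it learns to predict the absolute residual.
train_residuals = np.abs(y_train - model.predict(X_train))
sigma_model = KNeighborsRegressor(n_neighbors=10).fit(X_train, train_residuals)

# Normalized scores on the calibration set.
beta = 1e-6
scores = normalized_scores(
    y_calib, model.predict(X_calib), sigma_model.predict(X_calib), beta
)

# Split-conformal quantile of the scores (finite-sample corrected rank).
alpha = 0.1
k = int(np.ceil((len(scores) + 1) * (1 - alpha)))
q = np.sort(scores)[k - 1]

# Intervals scale with the predicted difficulty of each test point.
sigma_test = np.clip(sigma_model.predict(X_test), 0.0, None) + beta
y_pred = model.predict(X_test)
lower, upper = y_pred - q * sigma_test, y_pred + q * sigma_test
print(lower.shape, upper.shape)
```

Since sigma_model is only used through fit and predict, any sklearn-style regressor could be swapped in by the user.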

gmartinonQM avatar Jan 17 '22 14:01 gmartinonQM

Any news about this, @sdestercke @M-Soundouss?

gmartinonQM avatar Mar 11 '22 08:03 gmartinonQM

@gmartinonQM Not so far, @M-Soundouss is writing her thesis, which is time-consuming. Helping to implement multi-output regression for MAPIE is still on our to-do list, though.

sdestercke avatar Mar 11 '22 15:03 sdestercke

Hi @sdestercke and @M-Soundouss, do you have any news about this issue?

vincentblot28 avatar Feb 23 '23 09:02 vincentblot28

Hello @vincentblot28! I started working on it, I'll keep you updated. Thank you!

M-Soundouss avatar Feb 24 '23 16:02 M-Soundouss