Description

MAPIE was unable to perform confidence estimation on binary classification problems. To address this, I have implemented the Mondrian conformal predictor as a method of MapieClassifier. The method is described in detail on page 5 of this paper; in essence, it uses a separate quantile of the conformity scores for each class to decide inclusion in the prediction set, rather than a single quantile computed over all classes. It is not restricted to binary classification and should work for imbalanced multiclass problems as well.
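
A minimal sketch of the idea, not the code in this PR: it assumes the conformity score is one minus the predicted probability of the true label, integer labels 0..K-1, and at least one calibration sample per class. The function names are illustrative only.

```python
import numpy as np


def conformal_quantile(scores: np.ndarray, alpha: float) -> float:
    """Return the ceil((n + 1) * (1 - alpha))-th smallest score, or inf if out of range."""
    n = len(scores)
    rank = int(np.ceil((n + 1) * (1 - alpha)))  # 1-based rank
    if rank > n:
        return float("inf")
    return float(np.sort(scores)[rank - 1])


def mondrian_prediction_sets(
    proba_calib: np.ndarray,
    y_calib: np.ndarray,
    proba_test: np.ndarray,
    alpha: float,
) -> np.ndarray:
    """Boolean array (n_test, n_classes): True where a class is in the prediction set."""
    n_classes = proba_calib.shape[1]
    # Conformity score of each calibration sample: 1 - probability of its true label.
    scores = 1.0 - proba_calib[np.arange(len(y_calib)), y_calib]
    # One quantile per class, using only that class's calibration scores.
    quantiles = np.array(
        [conformal_quantile(scores[y_calib == k], alpha) for k in range(n_classes)]
    )
    # Class k is included when its candidate score does not exceed its own quantile.
    return (1.0 - proba_test) <= quantiles
```

A non-Mondrian predictor would instead call `conformal_quantile(scores, alpha)` once on all calibration scores and compare every class against that single threshold.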

Closes #216

Type of change

New feature (non-breaking change which adds functionality)

How Has This Been Tested?

[x] Ran the make test command, and all tests passed

I also tested the changes in a workflow that uses MAPIE; switching the method to mondrian gave similar results.

Checklist

[x] I have read the contributing guidelines
[x] I have updated the HISTORY.rst and AUTHORS.rst files
[x] Linting passes successfully : make lint
[x] Typing passes successfully : make type-check
[x] Unit tests pass successfully : make tests
[x] Coverage is 100% : make coverage
[x] Documentation builds successfully : make doc

Unfortunately, I do not have much experience writing tests and do not know the best way to approach this, so I would be grateful for any help or advice on that front.

Nov 02 '22 11:11 adamzenith

Hey, I have tried to read the error logs of the failed tests, but they seem to fail while installing numpy, and I cannot see why. If anyone has advice on how to fix this, it would be much appreciated.

Nov 30 '22 09:11 adamzenith

Can we get this merged? It would be super useful.

Feb 28 '23 14:02 CihanDogan94

Hello @adamzenith,

Thank you for submitting your pull request proposing the Mondrian Conformal Predictor. I read your modifications and proposals for implementing this method with interest. I hope I have understood them correctly and that the corrections I suggest are relevant. Don't hesitate to share your feedback with me!

1. Your PR in a nutshell

You have proposed an implementation of the Mondrian Conformal Predictor as a method of the MapieClassifier.

The goal of this method is to ensure a conditional coverage of $1-\alpha$ for each class by computing the $1-\alpha$ quantile of the conformal scores for each class to determine their inclusion in the prediction set.
As you stated, this method is not limited to binary classification and should also work for unbalanced multiclass problems.
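
For reference, the guarantee can be written out as follows. The notation is introduced here for illustration and is not taken from the PR: $s_i$ are the conformal scores of the calibration points, $n_k$ is the number of calibration points with label $k$, and the points are assumed exchangeable within each class.

$$
\hat{q}_k = \text{the } \lceil (n_k + 1)(1 - \alpha) \rceil \text{-th smallest value of } \{ s_i : y_i = k \},
\qquad
C(x) = \{ k : s(x, k) \le \hat{q}_k \}
$$

$$
P\big( Y \in C(X) \mid Y = k \big) \ge 1 - \alpha \quad \text{for every class } k
$$

When $\lceil (n_k + 1)(1 - \alpha) \rceil > n_k$, the quantile is taken to be $+\infty$, so class $k$ is always included.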

2. Our feedback on the PR

We believe that the Mondrian Conformal Predictor could be a good enhancement to MAPIE, as it has been mentioned and popularized in related work on drug discovery. However, at this time, we lack comparisons with the existing methods in MAPIE that would show the value of this method in specific use cases. We need concrete examples (in Jupyter notebooks, for example) demonstrating that the Mondrian Conformal Predictor is better than other methods in MAPIE for solving binary or unbalanced multi-class problems. This will be a demonstration not only for us but for all MAPIE users. We invite you to consult the existing notebooks to help you.
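
One quantity such a notebook could report is coverage computed separately for each class, since a method with good overall coverage can still under-cover a rare class. The helper below is only a sketch: the name `classwise_coverage` is chosen for illustration and is not taken from MAPIE, and it assumes integer labels 0..K-1 and prediction sets given as a boolean array of shape (n_samples, n_classes).

```python
import numpy as np


def classwise_coverage(y_true: np.ndarray, pred_sets: np.ndarray) -> np.ndarray:
    """Fraction of samples of each class whose true label is in its prediction set."""
    n_classes = pred_sets.shape[1]
    # covered[i] is True when sample i's true label is in its prediction set.
    covered = pred_sets[np.arange(len(y_true)), y_true]
    return np.array(
        [
            covered[y_true == k].mean() if np.any(y_true == k) else np.nan
            for k in range(n_classes)
        ]
    )
```

Comparing this array for a Mondrian and a non-Mondrian predictor on the same imbalanced test set would show whether the per-class quantiles bring each class closer to the target level.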

3. Additional comments to improve your code

My suggestions are about modifications to make your code as generic as possible.

I noticed that you have added elements that are specific to your development environment and are therefore not intended for generic use in MAPIE (in .gitignore and Makefile). This is not a problem in itself for code execution, but we prefer to keep the code as generic as possible.
In the same vein, you have adapted the compute_quantiles function with a new parameter named mondrian (in the utils.py file). Even if exceptions exist, we prefer to continue to implement generic functions that do not depend on external method attributes, especially when these choices impact the size and shape of the output (since as many quantiles as classes are computed when mondrian=True, whereas only one quantile is computed with mondrian=False).

4. Actions to be taken

I propose a list of actions to help you improve your proposal and help us integrate it into MAPIE:

Propose a notebook that demonstrates the value of using the Mondrian Conformal Predictor in place of the other multi-class conformal predictors proposed in MapieClassifier.
Delete the non-generic settings in the .gitignore and Makefile files (related to your virtual environment mapieenv).
Implement a new function named compute_class_quantiles which performs the same function as compute_quantiles with the parameter mondrian=True in the utils.py file.
Correct typing errors (such as line breaks in mapie/classification.py).
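
To illustrate the third action, here is one possible shape for such a function. This is a hedged sketch, not MAPIE's API: the signature, the `n_classes` argument, and the choice to return infinity for classes with too few calibration samples are all assumptions made for this example.

```python
import numpy as np


def compute_class_quantiles(
    scores: np.ndarray, y: np.ndarray, alphas: np.ndarray, n_classes: int
) -> np.ndarray:
    """Per-class conformal quantiles, returned with shape (n_classes, n_alpha).

    An entry stays at +inf when the class has too few calibration samples
    for the requested alpha, so that class is always kept in the set.
    """
    quantiles = np.full((n_classes, len(alphas)), np.inf)
    for k in range(n_classes):
        class_scores = np.sort(scores[y == k])
        n = len(class_scores)
        for j, alpha in enumerate(alphas):
            rank = int(np.ceil((n + 1) * (1 - alpha)))  # 1-based rank
            if 1 <= rank <= n:
                quantiles[k, j] = class_scores[rank - 1]
    return quantiles
```

Keeping this separate from compute_quantiles means each function has one fixed output shape, which is the concern raised in section 3.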

I remain available if you have any questions and thank you in advance for your feedback.

Mar 07 '23 09:03 thibaultcordier

I tested this out myself and it worked well. Nice job.

Mar 10 '23 15:03 GabeNicholson

Would be very helpful!

Oct 09 '23 03:10 CoteDave

Added binary classification support to MAPIE using the mondrian conformal predictor
