
(Semi-)Supervised Domain Adaptation for regression problem using POT

Open MrPr3ntice opened this issue 2 years ago • 3 comments

🚀 Feature

Extension of the methods in ot.da.* to regression problems (so far they seem to support only classification).

Motivation

I have already used ot.da.SinkhornLpl1Transport for domain adaptation in (semi-)supervised classification problems, i.e. via ot.da.SinkhornLpl1Transport.fit(Xs, ys, Xt, yt), where yt contains either the class label of a sample (a positive scalar) or -1 if the label is unknown. The only way I have found to transfer this method to a (metric) regression problem is to convert it into a classification problem, e.g. by discretising the metric target value y into, say, 10 classes. Of course, this conversion is not ideal, since both the natural order of y and the distances between the y values are lost in a classification problem.
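
For reference, the discretisation workaround described above can be sketched roughly as follows. This is only a minimal illustration and not part of POT: the helper name `discretise_targets` and its parameters `n_bins`, `y_min` and `y_max` are hypothetical, and only NumPy is used. Unlabelled samples are assumed to be given as NaN and are mapped to -1.

```python
import numpy as np


def discretise_targets(y, n_bins=10, y_min=None, y_max=None):
    """Hypothetical helper: map continuous targets to class labels 1..n_bins.

    NaN entries (unlabelled samples) are mapped to -1.
    """
    y = np.asarray(y, dtype=float)
    labelled = ~np.isnan(y)

    # Use the given value range if provided, else the range of the labelled data.
    lo = np.nanmin(y) if y_min is None else y_min
    hi = np.nanmax(y) if y_max is None else y_max

    # Interior edges of n_bins equal-width bins over [lo, hi].
    edges = np.linspace(lo, hi, n_bins + 1)[1:-1]

    labels = np.full(y.shape, -1, dtype=int)
    # np.digitize returns 0..n_bins-1 here; shift so that labels are positive.
    labels[labelled] = np.digitize(y[labelled], edges) + 1
    return labels


# Source targets are fully labelled; target targets are partly unknown (NaN).
ys_cls = discretise_targets([0.0, 2.5, 5.0, 10.0], n_bins=4, y_min=0.0, y_max=10.0)
yt_cls = discretise_targets([np.nan, 7.4], n_bins=4, y_min=0.0, y_max=10.0)
```

The resulting integer labels could then be passed as ys and yt to ot.da.SinkhornLpl1Transport.fit(Xs, ys, Xt, yt). Source and target need to share the same `y_min` and `y_max` so that equal class labels correspond to equal value ranges. The sketch also makes the drawback visible: after binning, class 1 is no "closer" to class 2 than to class 4.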

Pitch

Ideally, yt could take either class labels or metric target values. Samples without label information would be marked with e.g. numpy.nan instead of -1. Whether the problem is a regression or a classification problem would either be specified with an additional parameter, e.g. is_cls=True/False, or be detected automatically (harder).
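
To make the pitch a bit more concrete, here is one conceivable way to use metric targets directly, sketched under strong assumptions. It is not an existing POT feature and not the method of any paper: the function name `label_aware_cost` and the weight `eta` are hypothetical. The idea is to add a penalty proportional to |ys_i - yt_j| to the usual squared Euclidean feature cost, and to apply no penalty where the target label is NaN (unknown).

```python
import numpy as np


def label_aware_cost(Xs, ys, Xt, yt, eta=1.0):
    """Hypothetical sketch: feature cost plus a target-distance penalty.

    Xs: (ns, d) source features, ys: (ns,) source targets
    Xt: (nt, d) target features, yt: (nt,) target targets, NaN if unknown
    eta: assumed trade-off weight between feature and label distance
    """
    Xs = np.asarray(Xs, dtype=float)
    Xt = np.asarray(Xt, dtype=float)
    ys = np.asarray(ys, dtype=float)
    yt = np.asarray(yt, dtype=float)

    # Pairwise squared Euclidean distances between source and target features.
    M = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(axis=2)

    # Pairwise absolute differences between target values.
    D = np.abs(ys[:, None] - yt[None, :])
    # Pairs involving an unknown label (NaN) receive no label penalty.
    D = np.where(np.isnan(D), 0.0, D)

    return M + eta * D


Xs = [[0.0], [1.0]]
ys = [0.0, 2.0]
Xt = [[0.0], [1.0]]
yt = [2.0, np.nan]  # second target sample is unlabelled
cost = label_aware_cost(Xs, ys, Xt, yt, eta=10.0)
```

Such a cost matrix could then be handed to a generic solver such as ot.emd(a, b, M) or ot.sinkhorn(a, b, M, reg). Unlike the discretisation workaround, this keeps both the order of y and the distances between target values, although whether it behaves well in practice would still have to be checked.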

Alternatives

Maybe I am missing something and regression problems are already supported, or maybe this is impossible to implement because OT cannot work with a yt of metric scale.

Additional context

Nothing to add here.

MrPr3ntice, Feb 15 '22 15:02

Hello @MrPr3ntice. Indeed, this extension could be useful. Please note that nothing prevents it in theory. You can take a look at this repo, https://github.com/rflamary/JDOT, where we use an OT-based strategy to perform DA on a regression task.

ncourty, Feb 15 '22 16:02

Thanks @ncourty for your insights! Thanks also for mentioning JDOT, which I had come across earlier. If I remember correctly, though, it does not provide any (semi-)supervised strategies (i.e. yt is only used for validation, not for fitting the data alignment)? I will take a closer look at the class regularization theory in https://arxiv.org/pdf/1507.00504.pdf (section 4) and may come up with a proposal for the (semi-)supervised regression problem. As you mentioned, nothing should prevent this in theory.

MrPr3ntice, Feb 16 '22 09:02

Yes! You can also take a look at https://arxiv.org/pdf/2202.06208.pdf (this is shameless self-promotion, sorry) for a work (under review) on a specific type of regularizer for DA in the regression setting. Hopefully, once it is accepted, we will add it to POT.

ncourty, Feb 21 '22 10:02