POT
(Semi-)Supervised Domain Adaptation for regression problems using POT
🚀 Feature
Extend the methods in ot.da.* to regression problems (so far they seem to support only classification (?)).
Motivation
I have already used ot.da.SinkhornLpl1Transport for domain adaptation in (semi-)supervised classification problems, i.e. via ot.da.SinkhornLpl1Transport.fit(Xs, ys, Xt, yt), where yt contains either the class label of a sample (a positive scalar) or -1 if the label is unknown. The only way I have found to transfer this method to a (metric) regression problem is to convert it into a classification problem, e.g. by discretising the metric target value y into, say, 10 classes. Of course, this conversion is not ideal, because both the natural order of y and the distances between y values are lost in a classification problem.
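For concreteness, the discretisation workaround described above could look roughly like the sketch below. It uses only NumPy; the helper name discretise_targets, the equal-width binning, and the use of NaN as the input marker for unknown targets are assumptions made for illustration and are not part of POT. The labels are shifted to start at 1 so that -1 stays free as the "unknown" marker.

```python
import numpy as np


def discretise_targets(y, n_bins=10, unlabeled=-1):
    """Map continuous targets to integer class labels using equal-width bins.

    Hypothetical helper (not part of POT). NaN entries are treated as
    "label unknown" and mapped to `unlabeled`.
    """
    y = np.asarray(y, dtype=float)
    labels = np.full(y.shape, unlabeled, dtype=int)
    known = ~np.isnan(y)
    if not known.any():
        return labels
    # Interior bin edges spanning the range of the known targets.
    edges = np.linspace(y[known].min(), y[known].max(), n_bins + 1)[1:-1]
    # np.digitize yields 0..n_bins-1; shift to 1..n_bins (positive labels).
    labels[known] = np.digitize(y[known], edges) + 1
    return labels


y = [0.0, 0.5, 1.0, np.nan]
print(discretise_targets(y, n_bins=2))
```

With two bins, the targets 0.5 and 1.0 end up in the same class, which illustrates the loss of ordering and distance information mentioned above. The resulting arrays could then be passed as ys and yt to the fit call quoted in this issue.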
Pitch
Ideally, yt could take either class labels or metric target values. Samples without label information would be marked with e.g. numpy.nan instead of -1. Whether the problem is regression or classification would be determined either by an additional parameter, e.g. is_cls=True/False, or automatically (harder).
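To make the proposed convention concrete, here is a minimal sketch of how such an input could be interpreted. The helper split_targets is purely hypothetical and does not exist in POT; it only illustrates the numpy.nan marker and the is_cls flag suggested above, not how the transport itself would use the targets.

```python
import numpy as np


def split_targets(yt, is_cls):
    """Hypothetical helper: separate known targets from unknown (NaN) ones.

    Returns a boolean mask of labeled samples and the known target values.
    With is_cls=True the known values must be integer-valued class labels;
    with is_cls=False they are kept as floats (regression targets).
    """
    yt = np.asarray(yt, dtype=float)
    known = ~np.isnan(yt)
    values = yt[known]
    if is_cls:
        if not np.all(values == np.round(values)):
            raise ValueError("class labels must be integer-valued")
        values = values.astype(int)
    return known, values


# Regression: the second target sample has no label.
mask, values = split_targets([1.5, np.nan, 0.2], is_cls=False)
print(mask, values)
```

An explicit flag avoids guessing the problem type from the data, since integer-valued regression targets would otherwise be indistinguishable from class labels.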
Alternatives
Maybe I am missing something and regression problems are already supported, or this is impossible to implement because OT cannot work with yt values on a metric scale.
Additional context
Nothing to add here.
Hello @MrPr3ntice. Indeed, this extension could be useful. Please note that nothing prevents it in theory. You can take a look at this repo, https://github.com/rflamary/JDOT, where we use an OT-based strategy to perform DA on a regression task.
Thanks @ncourty for your insights! Thanks also for mentioning JDOT, which I had come across earlier, but if I remember correctly, it provides no (semi-)supervised strategies (i.e. yt is only used for validation, not for fitting the data alignment)? I will take a deeper look at the class regularization theory in https://arxiv.org/pdf/1507.00504.pdf (section 4) and may come up with a proposal for the (semi-)supervised regression problem. As you mentioned, nothing should prevent this in theory.
Yes! You can also take a look at https://arxiv.org/pdf/2202.06208.pdf (this is shameless self-promotion, sorry), a work (under review) on a specific type of regularizer for DA in the regression setting. Hopefully, once it is accepted, we will add it to POT.