
Coral_loss importance_weights

Open DbrRoxane opened this issue 3 years ago • 5 comments

Hi,

I have a classification task for a rating system with 5 classes. I now want to switch to ordinal regression, but I am a bit lost regarding the importance weights. At the moment, my weights for the 5 classes are tensor([1.4237, 1.0000, 2.4000, 1.2923, 1.1200]) (so my dominant class is class 2 and the one with the fewest examples is class 3). How should I convert this to pass it as importance_weights to coral_loss?

Regards,

DbrRoxane avatar Apr 13 '21 12:04 DbrRoxane

Hi there,

We didn't include the last rank due to redundancy. E.g., if you have the 5 classes 0, 1, 2, 3, 4, the 4 tasks are p(y > 0), p(y > 1), p(y > 2), p(y > 3). If p(y > 3) is greater than 0.5, we assume it's class 4.
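As a minimal sketch (plain PyTorch, not necessarily the library's own helper), this is how a class label expands into the K-1 binary level targets described above:

```python
import torch

def labels_to_levels(labels, num_classes):
    # For K classes, each label y becomes K-1 binary targets:
    # level k is 1 iff y > k, for k = 0 .. K-2.
    thresholds = torch.arange(num_classes - 1)
    return (labels.unsqueeze(1) > thresholds.unsqueeze(0)).float()

labels = torch.tensor([0, 2, 4])
print(labels_to_levels(labels, num_classes=5))
# class 0 -> [0, 0, 0, 0], class 2 -> [1, 1, 0, 0], class 4 -> [1, 1, 1, 1]
```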

So the weight for the first task p(y > 0) would be for deciding between class 0 and 1. I am not sure whether using

  • tensor([1.4237, 1.0000, 2.4000, 1.2923])
  • or tensor([1.0000, 2.4000, 1.2923, 1.1200])

would be more appropriate for this situation. Or, maybe the compromise:

tensor([(1.4237+1.0000)/2, (1.0000+2.4000)/2, (2.4000+1.2923)/2, (1.2923+1.1200)/2])
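In code, the averaging compromise above is a one-liner (variable names are illustrative):

```python
import torch

class_weights = torch.tensor([1.4237, 1.0000, 2.4000, 1.2923, 1.1200])

# Each of the K-1 tasks separates class k from class k+1, so average
# the weights of the two adjacent classes.
task_weights = (class_weights[:-1] + class_weights[1:]) / 2
print(task_weights)  # four values, one per binary task
```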

rasbt avatar Apr 14 '21 13:04 rasbt

Hi, I was asking myself something similar. I was thinking that maybe, since all K-1 tasks are binary tasks, the logic could be to accumulate the weights as follows:

λ(0) = (1.0000 + 2.4000 + 1.2923 + 1.1200) / 1.4237
λ(1) = (2.4000 + 1.2923 + 1.1200) / 1.4237

and so on. Rescaling is needed to account for the first class. I would appreciate any feedback on this.

pmorerio avatar Apr 15 '21 13:04 pmorerio

> So the weight for the first task p(y > 0) would be for deciding between class 0 and 1. I am not sure whether using

My logic in the comment above is that the first task would be for deciding between class 0 and any of [1,2,3,4]. Not sure I am correct however 😉.

pmorerio avatar Apr 15 '21 14:04 pmorerio

The compromise tensor([(1.4237+1.0000)/2, (1.0000+2.4000)/2, (2.4000+1.2923)/2, (1.2923+1.1200)/2]) makes sense to me, when we think about each output neuron as being responsible for deciding between two adjacent classes.

ananiask8 avatar Aug 17 '21 07:08 ananiask8

6/6/23: I catastrophically buried the lede with previous edits, so hopefully someone will see this edit and get what I'm trying to say :)

I think importance weights should be viewed as addition to, rather than a manipulation of, class weights.

Take class imbalance (or any class-based issue to be resolved through balancing). We're not trying to have the model "see more of" a particular level or levels, but rather particular classes - or, more specifically, observations from particular classes. This is an entirely separate issue from comparing class k to class k+1, which is what levels deal with.

The documentation says to use a vector with one weight per level, but the function doesn't actually check the size of importance_weights; it just multiplies it against the per-level, per-sample loss. This is a good thing, because it allows you to pass a matrix. I recommend passing the outer product of your class and importance weights.
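A sketch of one way to build that combined weighting (assuming the loss broadcasts whatever you pass against the (batch, K-1) per-level loss; the example values and names are illustrative):

```python
import torch

class_weights = torch.tensor([1.4237, 1.0000, 2.4000, 1.2923, 1.1200])  # one per class
level_weights = torch.tensor([1.2119, 1.7000, 1.8462, 1.2062])          # one per K-1 level

# (num_classes, num_classes-1) matrix: row y holds the weights applied
# to every level of a sample whose true class is y.
weight_matrix = torch.outer(class_weights, level_weights)

# For a batch of labels, select each sample's row to get a
# (batch, num_classes-1) tensor to pass as importance_weights.
labels = torch.tensor([0, 2, 4])
per_sample_weights = weight_matrix[labels]
print(per_sample_weights.shape)  # torch.Size([3, 4])
```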

chrico-bu-uab avatar Aug 14 '22 19:08 chrico-bu-uab