coral-pytorch
Coral_loss importance_weights
Hi,
I have a classification task for a rating system with 5 classes, and I now want to switch to ordinal regression, but I am a bit lost regarding the importance weights. At the moment, my weights for the 5 classes are tensor([1.4237, 1.0000, 2.4000, 1.2923, 1.1200]) (so my dominant class is 2 and the one with the fewest examples is 3). How should I convert these to pass them as importance_weights to coral_loss?
Regards,
Hi there,
We didn't include the last rank due to redundancy. E.g., if you have the 5 classes 0, 1, 2, 3, 4, the 4 tasks are p(y > 0), p(y > 1), p(y > 2), p(y > 3). If p(y > 3) is greater than 0.5, we assume it's class 4.
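To make the task structure concrete, here is a small sketch in plain PyTorch (not the library's own helper) of how an ordinal label with K = 5 classes expands into the K-1 binary level targets described above:

```python
import torch

def label_to_levels(label: int, num_classes: int = 5) -> torch.Tensor:
    # For K classes there are K-1 binary tasks: [y > 0, y > 1, ..., y > K-2].
    return torch.tensor([1.0 if label > k else 0.0 for k in range(num_classes - 1)])

# Class 0 turns on no tasks; class 2 turns on the first two; class 4 turns on all four.
label_to_levels(0)  # tensor([0., 0., 0., 0.])
label_to_levels(2)  # tensor([1., 1., 0., 0.])
label_to_levels(4)  # tensor([1., 1., 1., 1.])
```

The redundancy mentioned above is visible here: a fifth task for p(y > 4) would be all zeros for every label, so it carries no information.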
So the weight for the first task p(y > 0) would be for deciding between class 0 and 1. I am not sure whether using
- tensor([1.4237, 1.0000, 2.4000, 1.2923])
- or tensor([1.0000, 2.4000, 1.2923, 1.1200])
would be more appropriate for this situation. Or, maybe the compromise:
tensor([(1.4237+1.0000)/2, (1.0000+2.4000)/2, (2.4000+1.2923)/2, (1.2923+1.1200)/2])
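A minimal sketch of this compromise, computed from the class weights in the question (averaging adjacent class weights so that task k, which sits between class k and class k+1, gets the mean of the two):

```python
import torch

# Per-class weights from the question (5 classes).
class_weights = torch.tensor([1.4237, 1.0000, 2.4000, 1.2923, 1.1200])

# One weight per binary task: the mean of the two adjacent class weights.
task_weights = (class_weights[:-1] + class_weights[1:]) / 2
# -> 4 values, one for each of the K-1 = 4 tasks
```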
Hi, I was asking myself something similar. Since all K-1 tasks are binary, maybe the logic could be to cumulate the probability mass as follows:
λ(0) = (1.0000 + 2.4000 + 1.2923 + 1.1200) / 1.4237
λ(1) = (2.4000 + 1.2923 + 1.1200) / 1.4237
and so on. Rescaling is needed to account for the first class. I would appreciate any feedback on this.
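A sketch of this cumulative proposal, under the assumption that λ(k) sums the class weights above class k and rescales by the first class's weight:

```python
import torch

# Per-class weights from the question (5 classes).
class_weights = torch.tensor([1.4237, 1.0000, 2.4000, 1.2923, 1.1200])

# lambda(k) = (sum of weights of classes strictly above k) / (weight of class 0).
lam = torch.tensor([class_weights[k + 1:].sum() / class_weights[0]
                    for k in range(len(class_weights) - 1)])
# Early tasks (which separate many classes) get larger weights than late ones.
```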
> So the weight for the first task p(y > 0) would be for deciding between class 0 and 1. I am not sure whether using
My logic in the comment above is that the first task would be for deciding between class 0 and any of [1,2,3,4]. Not sure I am correct however 😉.
The compromise tensor([(1.4237+1.0000)/2, (1.0000+2.4000)/2, (2.4000+1.2923)/2, (1.2923+1.1200)/2])
makes sense to me when we think of each output neuron as being responsible for deciding between one of two classes.
6/6/23: I catastrophically buried the lede with previous edits, so hopefully someone will see this edit and get what I'm trying to say :)
I think importance weights should be viewed as an addition to, rather than a manipulation of, class weights.
Take class imbalance (or any class-based issue to be resolved through balancing). We're not trying to have the model "see more of" a particular level or levels, but rather particular classes - or, more specifically, observations from particular classes. This is an entirely separate issue from comparing class k to class k+1, which is what levels deal with.
The documentation says to use a vector representing levels, but the function doesn't actually check the size of importance_weights; it just multiplies it against the per-level, per-sample loss. This is a good thing, because it allows you to pass a matrix. I recommend passing the outer product of your class and importance weights.
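A sketch of that outer-product idea, assuming the class weights from the question and a hypothetical set of per-level weights (the all-ones vector here is just a placeholder). Each example gets its class's weight on every level, scaled by that level's weight:

```python
import torch

# Per-class weights from the question (5 classes); level weights are a
# hypothetical placeholder (one per binary task).
class_weights = torch.tensor([1.4237, 1.0000, 2.4000, 1.2923, 1.1200])
level_weights = torch.tensor([1.0, 1.0, 1.0, 1.0])

labels = torch.tensor([0, 2, 4])        # a toy batch of ordinal labels
sample_w = class_weights[labels]        # one class weight per example

# Shape (batch, K-1): row i is example i's class weight times each level weight.
weight_matrix = torch.outer(sample_w, level_weights)
# This matrix would then be passed as importance_weights, where it multiplies
# the per-sample, per-level losses elementwise.
```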