
matrix multiplication order question

Open tc64 opened this issue 5 years ago • 3 comments

Hi, thanks for sharing your code! I read your paper and am wondering about the matrix multiplication order for the backward loss correction approach.

The paper says the backward-corrected loss is T^{-1} applied to the loss, i.e. ℓ^← = T^{-1} ℓ.

In loss.robust, for backward, we have:

return -K.sum(K.dot(y_true, P_inv) * K.log(y_pred), axis=-1)

It looks to me like the order of the matrix multiplication of P_inv and y_true should be switched. My guess is that I'm misunderstanding something, but I would really appreciate it if you could clarify.

Thanks!

tc64 avatar Apr 19 '19 22:04 tc64

Have you tried a simple example and checked what changes if you switch the order?
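Such a check might look like the following sketch, with plain NumPy standing in for the Keras ops (the 3-class P below is made up for illustration, not from the repo):

```python
import numpy as np

# Illustrative 3-class noise transition matrix (rows sum to 1).
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
P_inv = np.linalg.inv(P)

# Keras-style shapes: y_true is N x C with one-hot labels as ROWS (N=2 here).
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

# dot(y_true, P_inv): each example picks out the ROW of P_inv indexed by its
# label, which is exactly what the backward correction T^{-1} l needs before
# being multiplied with log(y_pred) and summed over the class axis.
out = np.dot(y_true, P_inv)
assert np.allclose(out[0], P_inv[0])
assert np.allclose(out[1], P_inv[1])

# The switched order, dot(P_inv, y_true), does not even typecheck when N != C:
# (3 x 3) @ (2 x 3) has mismatched inner dimensions and raises an error.
```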

giorgiop avatar May 02 '19 19:05 giorgiop

Hi, I'm confused about the calculation of the forward loss. In your paper, the forward loss is shown as follows (the attached image gives the equation, which applies P^T to the prediction). [image]

However, in the code it's calculated as: return -K.sum(y_true * K.log(K.dot(y_pred, P)), axis=-1). Why not use P.T instead of P?

rosefun avatar Jun 10 '19 13:06 rosefun

Hi! @tc64 @rosefun @giorgiop I think it is because y_pred has shape N (batch_size) x C (class_num). That is, y_pred = [f(x1)^T; f(x2)^T; ...], where f(x1) (a column vector) is the classifier's prediction on example x1. Thus (P^T f(x))^T = f(x)^T P, so the code uses K.dot(y_pred, P).

And the shape of y_true is likewise N (batch_size) x C (class_num), with one-hot labels as rows, so the backward correction is K.dot(y_true, P_inv).
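The row-vector convention above can be verified numerically. A sketch in plain NumPy (P and the predictions are illustrative, not taken from the repo):

```python
import numpy as np

# Illustrative noise transition matrix and a batch of N=2 predictions,
# stacked as ROWS of y_pred (shape N x C), as Keras does.
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
y_pred = np.array([[0.5, 0.3, 0.2],
                   [0.1, 0.6, 0.3]])

# Column-vector form from the paper, applied example by example: P^T f(x_i).
per_example = np.stack([P.T @ f for f in y_pred])

# Row-vector form from the code: one matmul over the whole batch, f(x)^T P.
batched = y_pred @ P

# (P^T f(x))^T == f(x)^T P, so K.dot(y_pred, P) matches the paper's P^T.
assert np.allclose(per_example, batched)
```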

guixianjin avatar Oct 21 '19 01:10 guixianjin