Question about L_ort
As far as I know, cos(X, Y) = 0 means X and Y are orthogonal, whereas the minimum value cos(X, Y) = -1 means X and Y point in opposite directions. So, as written in your manuscript, directly minimizing the cosine drives the two inputs toward being opposite rather than orthogonal.
I notice that you use the CosineEmbeddingLoss provided by PyTorch with the target set to -1. With the default margin of 0, the actual formula is therefore L_ort = max(0, cos(X, Y)) (I use X and Y here for ease of notation). That leaves me with two concerns:
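- The loss actually implemented is not consistent with the one defined in your manuscript;
- When cos(X, Y) is negative, L_ort = 0 and has no gradient, so the loss only ensures that cos(X, Y) <= 0. It does not push X and Y toward orthogonality; they can remain anywhere between orthogonal and opposite.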
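
To make the second point concrete, here is a minimal sketch of what I mean. The toy 2-D vectors are my own made-up inputs, not taken from your code, and I am assuming the default `margin=0`. It compares PyTorch's loss with the manual formula max(0, cos(X, Y)) and checks the gradient for a pair whose cosine is already negative:

```python
import torch
import torch.nn.functional as F

# Hypothetical toy inputs: pair 0 has a positive cosine, pair 1 a negative one.
x = torch.tensor([[1.0, 0.0], [1.0, 0.0]], requires_grad=True)
y = torch.tensor([[1.0, 1.0], [-1.0, 1.0]])

target = -torch.ones(x.size(0))  # target = -1 for every pair
cos = F.cosine_similarity(x, y, dim=1)

# reduction="none" keeps the per-pair losses visible.
loss = F.cosine_embedding_loss(x, y, target, margin=0.0, reduction="none")
manual = torch.clamp(cos, min=0.0)  # max(0, cos(X, Y))

loss.sum().backward()
grad_norms = x.grad.norm(dim=1)  # gradient magnitude per pair

print("cos:       ", cos.detach())
print("loss:      ", loss.detach())
print("manual:    ", manual.detach())
print("grad norms:", grad_norms)
```

In this sketch the pair with a negative cosine gets zero loss and zero gradient, so nothing moves it toward cos(X, Y) = 0.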
Am I misunderstanding something? I look forward to your reply.
Maybe you can set the loss to |cos(X, Y)|.
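
A rough sketch of that idea, in case it helps. The function name `abs_cosine_loss` and the toy vectors are hypothetical, chosen only for illustration; this is not the authors' implementation:

```python
import torch
import torch.nn.functional as F


def abs_cosine_loss(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Hypothetical orthogonality loss: mean of |cos(x_i, y_i)| over the batch."""
    return F.cosine_similarity(x, y, dim=1).abs().mean()


# Toy pairs: positive cosine, negative cosine, and already orthogonal.
x = torch.tensor([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]], requires_grad=True)
y = torch.tensor([[1.0, 1.0], [-1.0, 1.0], [0.0, 1.0]])

per_pair = F.cosine_similarity(x, y, dim=1).abs()
loss = abs_cosine_loss(x, y)
loss.backward()
grad_norms = x.grad.norm(dim=1)  # gradient magnitude per pair

print("per-pair |cos|:", per_pair.detach())
print("grad norms:    ", grad_norms)
```

Unlike max(0, cos(X, Y)), this penalizes negative cosines as well, so both the positive and the negative pair receive a gradient, and the loss reaches its minimum of 0 only when the pair is orthogonal.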
Correct. I know there are multiple ways to fix this issue. My main concern is the inconsistency between the code and the paper description.