
question about concentration around a prototype

Open vmmm123 opened this issue 3 years ago • 2 comments

In the paper, you mention: "With the proposed φ, the similarity in a loose cluster (larger φ) are down-scaled, pulling embeddings closer to the prototype". I am wondering why the down-scaled similarity forces the embeddings closer to the prototype. Could you please explain this in more detail? Thanks!

vmmm123 • Jan 07 '22 05:01

Hi, thanks for your question!

The loss function tries to increase the scaled similarity between an embedding v and its positive prototype c, which is v · c / φ. When φ is larger, v · c must also be larger to reach the same scaled similarity. Therefore, the embedding is pulled closer to the prototype.
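
A minimal numerical sketch of this argument, assuming a softmax cross-entropy over prototypes with logits v · c_j / φ_j. The function name `proto_loss` and all numbers are made up for illustration and are not taken from the PCL code:

```python
import numpy as np

def proto_loss(sims, phis, pos):
    """Cross-entropy over prototypes, with logit j equal to sims[j] / phis[j].

    sims : raw similarities v . c_j (hypothetical values)
    phis : per-prototype concentration estimates (hypothetical values)
    pos  : index of the positive prototype
    """
    logits = np.asarray(sims, dtype=float) / np.asarray(phis, dtype=float)
    logits = logits - logits.max()  # subtract the max for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[pos]

# Same raw similarity 0.4 to the positive prototype, different phi.
tight = proto_loss([0.4, 0.1, 0.0], [0.1, 0.1, 0.1], pos=0)
loose = proto_loss([0.4, 0.1, 0.0], [0.2, 0.1, 0.1], pos=0)

# With phi doubled, the raw similarity must double to give the same loss.
matched = proto_loss([0.8, 0.1, 0.0], [0.2, 0.1, 0.1], pos=0)

print(tight, loose, matched)
```

In this toy setting the loose cluster has a higher loss at the same v · c, and it only matches the tight cluster's loss once v · c has doubled.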

LiJunnan1992 • Jan 07 '22 06:01

OK, that is a direct way to think about it. I tried to understand it from the gradient perspective, and I am concerned that the larger gradient may force the model to focus more on tight clusters, where φ is smaller.
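
For reference, a sketch of the gradient with respect to the raw similarities, assuming the loss for one embedding v is a softmax cross-entropy over prototypes c_j with logits v · c_j / φ_j and positive prototype c_s (the probabilities p_j and the index s are notation introduced here, not taken from the paper):

```latex
\mathcal{L} = -\log p_s,
\qquad
p_j = \frac{\exp(v \cdot c_j / \phi_j)}{\sum_{k} \exp(v \cdot c_k / \phi_k)}

\frac{\partial \mathcal{L}}{\partial (v \cdot c_s)} = -\frac{1 - p_s}{\phi_s},
\qquad
\frac{\partial \mathcal{L}}{\partial (v \cdot c_j)} = \frac{p_j}{\phi_j} \quad (j \neq s)
```

At equal p_s, a smaller φ_s gives a larger gradient magnitude on the positive similarity. However, p_s itself depends on φ_s, so this expression alone does not settle which clusters dominate training.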

vmmm123 • Jan 07 '22 07:01