arcface-pytorch

What does easy_margin in models.metrics.ArcMarginProduct mean?

Open tengjn opened this issue 4 years ago • 11 comments

I don't get the lines below:

    if self.easy_margin:
        phi = torch.where(cosine > 0, phi, cosine)
    else:
        phi = torch.where(cosine > self.th, phi, cosine - self.mm)

in which self.mm = math.sin(math.pi - m) * m. What does mm mean?

BTW, I don't think the condition "cosine > 0" corresponds to the target position. The implementation seems to differ from the paper.

tengjn, Oct 21 '19 03:10
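
For readers following along, here is a minimal scalar sketch of the two branches quoted in the question. It uses plain `math` instead of `torch` so it runs anywhere. The function name `arc_margin_phi` and the default `m = 0.5` are assumptions made for illustration, and the final scaling of the logits is left out.

```python
import math

def arc_margin_phi(cosine: float, m: float = 0.5, easy_margin: bool = False) -> float:
    """Scalar sketch of the target-class value after the margin is applied."""
    # sin(theta) recovered from cos(theta); clamped to avoid tiny negative values
    sine = math.sqrt(max(0.0, 1.0 - cosine * cosine))
    # cos(theta + m) via the angle-addition identity
    phi = cosine * math.cos(m) - sine * math.sin(m)
    if easy_margin:
        # margin is applied only when theta < 90 degrees (cosine > 0)
        return phi if cosine > 0 else cosine
    th = math.cos(math.pi - m)      # cosine > th  <=>  theta + m < pi
    mm = math.sin(math.pi - m) * m  # constant subtracted once theta + m >= pi
    return phi if cosine > th else cosine - mm

for deg in (30, 100, 170):
    c = math.cos(math.radians(deg))
    print(deg, round(arc_margin_phi(c), 4), round(arc_margin_phi(c, easy_margin=True), 4))
```

With `easy_margin=False`, the margin form cos(theta + m) is used up to theta = pi - m, and the fallback `cosine - mm` takes over beyond that point.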

Hi. I can only explain the 'else' branch. When execution reaches this branch, we can't compute cos(theta + m) directly because theta + m > pi, so self.mm acts as a first-order Taylor expansion that approximates cos(theta + m).

ChiSuWq, Oct 28 '19 06:10

> Hi. I can only explain the 'else' branch. When execution reaches this branch, we can't compute cos(theta + m) directly because theta + m > pi, so self.mm acts as a first-order Taylor expansion that approximates cos(theta + m).

Hi, thanks for your reply, but I still don't understand. The first-order Taylor expansion of cos(theta + m) should be 'cos(m) - (theta - m)*sin(theta + m)', which is quite different from self.mm.

tengjn, Oct 28 '19 08:10

Hi, actually, we take the first-order Taylor expansion with theta as the starting point. The formula then becomes 'cos(θ) - m * sin(θ)', if I have written it correctly.

ChiSuWq, Oct 28 '19 08:10

[image] In your case, you treat m as the variable and theta as the constant, but I think it's the opposite. m should be the constant, which is 'a' in the figure.

tengjn, Oct 28 '19 09:10

Actually, you may be wrong. The 'm' changes θ to θ+m, and correspondingly, in the Taylor expansion, θ is the starting point. So in your picture, 'x' equals θ+m while 'a' is θ. In other words, the standard Taylor expansion is of f(x + Δx), and m is Δx here.

ChiSuWq, Oct 28 '19 09:10
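
For reference, the expansion under discussion can be written out with θ as the expansion point and m as the increment. This is only the standard first-order Taylor formula applied to the cosine, not something taken from the repository:

```latex
f(x + \Delta x) \approx f(x) + f'(x)\,\Delta x,
\qquad f = \cos,\quad x = \theta,\quad \Delta x = m
\;\Longrightarrow\;
\cos(\theta + m) \approx \cos\theta - m \sin\theta .
```

Under this reading the approximation is accurate only for small m, and the sine in the correction term is evaluated at θ.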

In my opinion, when easy_margin is True, the margin is added only when θ < 90°, which means ArcFace only takes effect once the model is roughly trained.

SJHNJU, May 04 '20 13:05

@ChiSuWq So, for the else branch, we need to handle (theta + m) > math.pi. If cos(theta) > cos(math.pi - m), then theta + m < math.pi, so phi is kept as it is; otherwise theta + m >= math.pi, and we use a Taylor expansion to approximate cos(theta + m). In fact, cos(theta + m) ≈ cos(theta) - m * sin(theta) >= cos(theta) - m * sin(math.pi - m).

Ontheway361, Jul 26 '20 13:07
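
The inequality in the comment above can be checked numerically on the interval where the fallback is used, theta in [pi - m, pi]. This is a small sketch only; the helper name `min_gap` and the value `m = 0.5` are assumptions for illustration.

```python
import math

def min_gap(m: float = 0.5, steps: int = 1000) -> float:
    """Smallest value of (first-order Taylor term) - (fallback term) on [pi - m, pi]."""
    lo = math.pi - m
    smallest = float("inf")
    for i in range(steps + 1):
        theta = lo + (math.pi - lo) * i / steps
        taylor = math.cos(theta) - m * math.sin(theta)            # first-order expansion
        fallback = math.cos(theta) - m * math.sin(math.pi - m)    # cosine - mm
        smallest = min(smallest, taylor - fallback)
    return smallest

print(min_gap())  # a non-negative value means the fallback never exceeds the Taylor term
```

The gap equals m * (sin(pi - m) - sin(theta)), which is non-negative on this interval as long as m <= pi / 2, because the sine is decreasing between pi / 2 and pi.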

Does anybody wonder why the original MXNet implementation just used mx.sym.cos(), without considering (theta + m) > pi?

doitslow, Sep 17 '20 13:09

> @ChiSuWq So, for the else branch, we need to handle (theta + m) > math.pi. If cos(theta) > cos(math.pi - m), then theta + m < math.pi, so phi is kept as it is; otherwise theta + m >= math.pi, and we use a Taylor expansion to approximate cos(theta + m). In fact, cos(theta + m) ≈ cos(theta) - m * sin(theta) >= cos(theta) - m * sin(math.pi - m).

So what happens if I clamp m + theta to Pi whenever m + theta > Pi, i.e. in this code I replace that branch with cos(m + theta) = cos(Pi)?

tks1998, Aug 06 '21 17:08
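
One way to look at this question is to compare the clamped variant with the fallback used in the quoted code. The sketch below is hypothetical: `clamped_logit` and `repo_style_logit` are illustrative names, and `m = 0.5` is an assumed margin.

```python
import math

def clamped_logit(theta: float, m: float = 0.5) -> float:
    """Variant from the question: treat theta + m as pi once it exceeds pi."""
    return math.cos(min(theta + m, math.pi))

def repo_style_logit(theta: float, m: float = 0.5) -> float:
    """Fallback from the quoted code: cosine - mm once theta + m >= pi."""
    if math.cos(theta) > math.cos(math.pi - m):
        return math.cos(theta + m)
    return math.cos(theta) - m * math.sin(math.pi - m)

for deg in (160, 170, 180):
    t = math.radians(deg)
    print(deg, clamped_logit(t), round(repo_style_logit(t), 4))
```

In this sketch the clamped variant stays at cos(pi) for every theta past pi - m, so it no longer changes with theta in that range, while the `cosine - mm` form keeps decreasing.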

It seems to me that the term m*sin(pi - m) has no meaning other than to make the similarity function monotonically decreasing. Figure 1 illustrates the function f = cos(\theta) - m * sin(\pi - m). Although no proof has been provided, it appears that this function is monotonically decreasing for theta=0.1, 0.01, 0.001.

A simpler realization of the same could be f=cos(theta) - (1 + cos(pi - m)) (Figure 2). In this case, the function is continuous at theta=pi - m.

The simulation code is here: https://www.kaggle.com/code/tatamikenn/arcface-visualize-easy-margin?scriptVersionId=96805831

Fig1. Screen-Shot-2022-05-28-at-15-11-30

Fig2. Screen-Shot-2022-05-28-at-15-16-26

bilzard, May 28 '22 06:05
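
The two fallbacks compared in this comment can also be checked without plots. The following is a minimal sketch, separate from the linked Kaggle notebook; the function names and the choice `m = 0.5` are assumptions for illustration.

```python
import math

def logit_mm(theta: float, m: float) -> float:
    """cos(theta + m) below pi - m, otherwise cos(theta) - m * sin(pi - m)."""
    if theta < math.pi - m:
        return math.cos(theta + m)
    return math.cos(theta) - m * math.sin(math.pi - m)

def logit_shift(theta: float, m: float) -> float:
    """cos(theta + m) below pi - m, otherwise cos(theta) - (1 + cos(pi - m))."""
    if theta < math.pi - m:
        return math.cos(theta + m)
    return math.cos(theta) - (1.0 + math.cos(math.pi - m))

def is_decreasing(f, m: float, steps: int = 2000) -> bool:
    """Check on a grid over [0, pi] that f never increases."""
    values = [f(math.pi * i / steps, m) for i in range(steps + 1)]
    return all(b <= a for a, b in zip(values, values[1:]))

def jump_at_threshold(f, m: float, eps: float = 1e-9) -> float:
    """Drop in f across theta = pi - m (close to 0 when f is continuous there)."""
    t = math.pi - m
    return f(t - eps, m) - f(t + eps, m)

m = 0.5
print(is_decreasing(logit_mm, m), jump_at_threshold(logit_mm, m))
print(is_decreasing(logit_shift, m), jump_at_threshold(logit_shift, m))
```

On a grid like this, both variants are non-increasing over [0, pi]; the difference is that the `mm` version drops by cos(m) + m * sin(m) - 1 at theta = pi - m, while the shifted version meets cos(theta + m) there.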

Hello, I have received your email. Thank you.

Ontheway361, May 28 '22 06:05