tensorflow-sphereface-asoftmax what is the meaning of 'f' and 'ff' ? 0.5???

What is the meaning of 'f' and 'ff'? 0.5???

Feb 19 '19 03:02 henbucuoshanghai

Hello @henbucuoshanghai, if you know the answer, please tell me.

Apr 16 '19 02:04 qiyang77

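total loss = (λ/(1+λ)) * softmax loss + (1/(1+λ)) * A-Softmax loss
i) The left-hand term is the original softmax loss. Early in training λ is very large, so the original softmax loss contributes most of the total.
ii) λ depends on iter (it decays), so λ becomes small later in training (it is clipped at 5). The L-Softmax loss therefore dominates in the later stages.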
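
A minimal sketch of the weighting described above, in plain Python. The names `f` and `ff` are assumed to correspond to the two weights in the formula (`f = 1/(1+λ)` and `ff = λ/(1+λ)`); that mapping is a guess, not something confirmed in this thread. The decay schedule in `lambda_at` and its `base`, `gamma` and `power` values are illustrative assumptions; only the floor of 5 comes from the reply.

```python
def lambda_at(step, base=1000.0, gamma=0.12, power=1.0, lambda_min=5.0):
    """Hypothetical decaying schedule for lambda, clipped from below at lambda_min.

    base, gamma and power are illustrative values, not taken from this thread.
    """
    return max(lambda_min, base * (1.0 + gamma * step) ** (-power))


def blend_weights(lam):
    """Return (f, ff): assumed weights of the margin loss and the plain softmax loss."""
    f = 1.0 / (1.0 + lam)   # weight of the A-Softmax term
    ff = 1.0 - f            # equals lam / (1 + lam), weight of the softmax term
    return f, ff


def total_loss(softmax_loss, asoftmax_loss, lam):
    """total loss = (lam/(1+lam)) * softmax loss + (1/(1+lam)) * A-Softmax loss."""
    f, ff = blend_weights(lam)
    return ff * softmax_loss + f * asoftmax_loss


if __name__ == "__main__":
    for step in (0, 100, 1000, 100000):
        lam = lambda_at(step)
        f, ff = blend_weights(lam)
        print(step, lam, f, ff)
```

Under these assumptions the weight of the margin term starts near 1/1001 at step 0 and rises to 1/6 once λ reaches the floor of 5.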

Apr 25 '19 15:04 sycophant-stone

Impressive. Why is the original softmax needed?

Apr 28 '19 03:04 henbucuoshanghai

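You can take a look at the large-margin softmax loss paper: Large-Margin Softmax Loss for Convolutional Neural Networks.
In short, L-Softmax does not converge easily. Training starts with the softmax loss, and the contribution of the L-Softmax loss is increased gradually as the epochs go on.
Quote: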

For optimization, normally the stochastic gradient descent will work well. However, when training data has too many subjects (such as CASIA-WebFace
dataset), the convergence of L-Softmax will be more difficult than softmax loss.

Apr 28 '19 13:04 sycophant-stone

Got it, impressive. This takes some deep study.

Apr 29 '19 10:04 henbucuoshanghai

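@sycophant-stone I noticed that in Loss_ASoftmax.py the logits returned at the end are the original logits, not updated_logits. In fact, if updated_logits is returned, as in several other implementations, the classification accuracy computed for each batch looks very abnormal... So which one should be returned?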

May 27 '19 03:05 LuisKay

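@sycophant-stone I noticed that in Loss_ASoftmax.py the logits returned at the end are the original logits, not updated_logits. In fact, if updated_logits is returned, as in several other implementations, the classification accuracy computed for each batch looks very abnormal... So which one should be returned?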

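In another implementation I saw that it returns updated_logits, but updated_logits requires the label data to be fed in. If I only want to run prediction, I don't know what to do about the labels. I tried changing that other code to return the original logits, and the accuracy was far lower than when returning updated_logits.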
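
To make the two quantities concrete, here is a small pure-Python sketch of plain logits versus margin-updated logits. It is not the code from Loss_ASoftmax.py or from the other implementation. It assumes normalized class weights, the A-Softmax angular function psi(θ) = (-1)^k cos(mθ) - 2k with m = 2, and no λ blending. All function names are hypothetical. It shows that the margin only rewrites the target-class entry, which is why a label is required and why a prediction taken from the updated logits can differ from one taken from the plain logits.

```python
import math


def cosine_logits(x, weights):
    """Plain logits ||x|| * cos(theta_j), one per class weight vector."""
    out = []
    for w in weights:
        w_norm = math.sqrt(sum(v * v for v in w))
        dot = sum(a * b for a, b in zip(x, w))
        out.append(dot / w_norm)  # dividing by ||w|| leaves ||x|| * cos(theta)
    return out


def margin_logit(logit, x_norm, m=2):
    """Apply psi(theta) = (-1)^k cos(m*theta) - 2k to a single logit."""
    cos_t = max(-1.0, min(1.0, logit / x_norm))
    theta = math.acos(cos_t)
    k = min(int(theta * m / math.pi), m - 1)  # theta lies in [k*pi/m, (k+1)*pi/m]
    return x_norm * ((-1) ** k * math.cos(m * theta) - 2 * k)


def updated_logits(logits, x_norm, label, m=2):
    """Copy of logits with only the target-class entry replaced; needs the label."""
    out = list(logits)
    out[label] = margin_logit(logits[label], x_norm, m)
    return out


def predict(logits):
    """Index of the largest logit; no label involved."""
    return max(range(len(logits)), key=logits.__getitem__)


if __name__ == "__main__":
    x = [1.0, 0.4]                       # toy feature vector
    weights = [[1.0, 0.0], [0.6, 0.8]]   # toy unit-norm class weights
    label = 0
    x_norm = math.sqrt(sum(v * v for v in x))
    plain = cosine_logits(x, weights)
    updated = updated_logits(plain, x_norm, label)
    print(plain, predict(plain))
    print(updated, predict(updated))
```

In this toy case the plain logits pick the labeled class while the margin-updated logits do not, since the margin lowers only the target-class score. Whether that explains the accuracy differences reported above depends on details of each implementation that are not shown in this thread.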

Jul 30 '19 01:07 jiazhen-code