Deep_Metric
Pretrained models
Hi, thanks for your great work! But it seems that the pre-trained models were removed from the cloud drive. Could you please kindly upload them again?
Thanks. I am sorry, the model has been deleted from the cloud drive.
You can download by yourself from here: http://data.lip6.fr/cadene/pretrainedmodels/bn_inception-239d2248.pth
And I will make the project clearer next week.
Thanks a lot! I have found this pre-trained model at https://github.com/Cadene/pretrained-models.pytorch#inception. Is this exactly the one you use?
Yes! BN-inception
URL: http://data.lip6.fr/cadene/pretrainedmodels/bn_inception-239d2248.pth
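For convenience, here is a minimal sketch of downloading that checkpoint and loading it into the backbone. The `bninception` constructor and the `model_zoo.load_url` helper are assumptions based on the pretrained-models.pytorch package and standard PyTorch; adjust the names if your versions differ.

```python
# Minimal sketch: fetch the BN-Inception checkpoint from the URL above and load it.
# Assumes the `pretrainedmodels` package from the linked repo is installed
# (pip install pretrainedmodels); names here may need adjusting to your versions.
import torch.utils.model_zoo as model_zoo
import pretrainedmodels

URL = 'http://data.lip6.fr/cadene/pretrainedmodels/bn_inception-239d2248.pth'

# Download (and cache) the weights, then load them into the backbone.
state_dict = model_zoo.load_url(URL)
model = pretrainedmodels.bninception(num_classes=1000, pretrained=None)
model.load_state_dict(state_dict)
model.eval()
```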
Hi @bnulihaixia, I have a small question about deep metric learning; it would be great if you could give me some explanation. I started reading deep metric learning papers only a few days ago, and I find it very confusing that almost no one in these DML papers uses a classification loss (i.e., softmax cross-entropy) to fine-tune on the CUB200/Cars196 datasets and reports it as a baseline. In your latest experimental results, knnSoftmax gets 60+ Recall@1 on CUB, which I think is reasonable. However, in my experiment (based on your code) with a plain softmax loss, I can also get a 60+ result. I am not sure whether this result is reasonable, since it seems too high compared with other DML methods. So I am wondering whether you have tried the softmax loss, and if so, what its approximate performance is.
P.S. I found a paper that uses the softmax loss as a baseline. Indeed, they argue that the softmax loss outperforms many DML methods, but they only report around 51% Recall@1 on CUB.
The paper that uses softmax loss as a baseline: https://arxiv.org/pdf/1712.10151v1.pdf
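For reference, the kind of softmax baseline I mean is roughly the following. This is only a minimal sketch: the backbone constructor, the `last_linear` attribute name, the class count, and the hyper-parameters are illustrative assumptions, not the exact code I ran.

```python
# Rough sketch of a softmax (cross-entropy) baseline on CUB200-2011:
# replace the 1000-way ImageNet classifier with a 100-way head
# (the standard DML split trains on the first 100 classes) and fine-tune end to end.
import torch
import torch.nn as nn
import pretrainedmodels

num_train_classes = 100  # CUB200-2011 training split
model = pretrainedmodels.bninception(num_classes=1000, pretrained='imagenet')
model.last_linear = nn.Linear(model.last_linear.in_features, num_train_classes)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9,
                            weight_decay=5e-4)

def train_one_epoch(loader):
    model.train()
    for images, labels in loader:   # loader yields (B, 3, 224, 224) image batches
        logits = model(images)
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```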
Thanks for your question.
I think the introduction of the lifted structured loss paper has already given the answer.
Before reading the paper you linked, I expected that the softmax loss might give good results on CUB and Cars, but would not perform very well on Online Products.
I took a quick look at the paper, and things turned out just as I expected. That is my point of view; it may not be right.

@bnulihaixia I agree with that. Metric losses do better when the training data per class becomes very scarce, and their complexity does not increase with the number of classes. Those are the advantages of metric losses.
However, even though we know softmax has its drawbacks, I still think the softmax loss should be a baseline, since some metric losses have a similar form to softmax [1, 2]. Researchers can argue that their metric losses perform better than softmax in some scenarios, but I don't think it is a good idea to simply ignore softmax.
Moreover, the linear complexity of softmax can also be addressed; see [3]. It seems that SenseTime trains a very large-scale softmax loss (10M+ classes) with this method and achieves pretty good performance on face recognition.
Thank you for your time! Zhongdao
[1] Movshovitz-Attias et al. No Fuss Distance Metric Learning Using Proxies.
[2] Meyer et al. Nearest Neighbour Radial Basis Function Solvers for Deep Neural Networks.
[3] Zhang et al. Accelerated Training for Massive Classification via Dynamic Class Selection.
I agree with you that the softmax loss should be a baseline. [2] is just the same loss as ONCA in [1], and I have pointed that out in my README. The loss can be interpreted as a classification loss in some sense, and it can also be interpreted as a standard metric loss, namely a weighted version of the triplet loss (see the sketch after the references below).
Thanks for providing the paper in [3]. I will read it in the next few days.
[1] Salakhutdinov, R., Hinton, G. Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure. In: International Conference on Artificial Intelligence and Statistics (2007).
[2] Meyer et al. Nearest Neighbour Radial Basis Function Solvers for Deep Neural Networks.
[3] Zhang et al. Accelerated Training for Massive Classification via Dynamic Class Selection.
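To make the connection concrete, below is a rough sketch of an NCA-style loss of the kind discussed above: a softmax over negative squared distances, so positives compete with negatives the way classes compete in a cross-entropy classifier, and the gradient weighting makes it readable as a weighted triplet objective. This is an illustrative implementation, not the exact loss in this repository.

```python
# Illustrative NCA-style loss over a batch of embeddings.
import torch
import torch.nn.functional as F

def nca_loss(embeddings, labels, temperature=1.0):
    """embeddings: (B, D) features, labels: (B,) class ids."""
    dists = torch.cdist(embeddings, embeddings, p=2).pow(2)  # (B, B) squared distances
    logits = -dists / temperature
    logits.fill_diagonal_(float('-inf'))                     # exclude self-matches
    log_p = F.log_softmax(logits, dim=1)                     # softmax over the batch

    same = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    same.fill_diagonal_(0)

    # Negative log of the probability mass assigned to same-class neighbours.
    pos_mass = (log_p.exp() * same).sum(dim=1).clamp_min(1e-12)
    return -pos_mass.log().mean()
```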
@Zhongdao Regarding the paper that uses the softmax loss as a baseline (https://arxiv.org/pdf/1712.10151v1.pdf): I have read it carefully over the last few days. You reach 60+ Recall@1 on the CUB data, while the paper only reaches 51. I think the reason may be that you used an embedding dimension of 512, whereas the paper used an embedding dimension of 64. If you use the same embedding dimension, I think you would get similar performance!
@bnulihaixia You're right, embedding size matters. Actually, I did not add an embedding layer; I fine-tuned directly on top of the pool5 layer, and from my observation this brings some performance gain.
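For anyone reproducing these numbers, Recall@1 on CUB/Cars is just nearest-neighbour retrieval over the test-set embeddings. A minimal sketch follows; it assumes a tensor of test embeddings and their labels and is illustrative rather than the repository's evaluation code.

```python
# Minimal Recall@1 sketch: each test image retrieves its nearest neighbour
# (excluding itself) in embedding space; the hit rate is Recall@1.
import torch
import torch.nn.functional as F

def recall_at_1(embeddings, labels):
    """embeddings: (N, D) test-set features, labels: (N,) class ids."""
    embeddings = F.normalize(embeddings, dim=1)   # cosine similarity
    sims = embeddings @ embeddings.t()            # (N, N)
    sims.fill_diagonal_(float('-inf'))            # do not retrieve oneself
    nearest = sims.argmax(dim=1)
    return labels[nearest].eq(labels).float().mean().item()
```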
@Zhongdao I have some questions about the details of your softmax training process. Could I have your telephone number or WeChat?
@bnulihaixia
Hi haixia, I am very interested in your great work. I am also working on deep metric learning and ReID; is there any chance I could have your WeChat for easier communication?
Thanks so much.
Yes, please give me your WeChat account.
Hi, can I also have your WeChat? I am a research intern from Face++, and I am also doing research on metric learning!
Please send your WeChat account to me, and I will add you.