Question about instance recognition metrics
Hi,
In Table 9 (Evaluation of frozen features on instance-level recognition) of the paper, the performance of OpenCLIP-G/14 is reported as 50.7 on Oxford-M and 19.7 on Oxford-H. However, we only get 39.4 on Oxford-M and 11.7 on Oxford-H (even without the 1M distractors) using the evaluation code at https://github.com/filipradenovic/revisitop/blob/master/python/evaluate.py#L39
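For reference, here is roughly how we compute these numbers. This is a minimal sketch assuming L2-normalized `(D, N)` feature matrices; `compute_map` and the medium/hard junk handling follow revisitop's python example code:

```python
import numpy as np
from evaluate import compute_map  # from filipradenovic/revisitop python/

def eval_medium_hard(query_feats, db_feats, gnd):
    # query_feats: (D, nq), db_feats: (D, n), both L2-normalized
    sim = np.dot(db_feats.T, query_feats)   # (n, nq) cosine similarities
    ranks = np.argsort(-sim, axis=0)        # database ranks, one column per query

    # Medium protocol: easy + hard are positives, junk is ignored
    gnd_m = [{'ok': np.concatenate([g['easy'], g['hard']]),
              'junk': g['junk']} for g in gnd]
    map_m, _, _, _ = compute_map(ranks, gnd_m)

    # Hard protocol: only hard are positives, easy + junk are ignored
    gnd_h = [{'ok': g['hard'],
              'junk': np.concatenate([g['junk'], g['easy']])} for g in gnd]
    map_h, _, _, _ = compute_map(ranks, gnd_h)
    return map_m, map_h
```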
We also ran revisited Oxford (without the 1M distractors) with the distilled DINOv2 ViT-B/14 backbone, using the make_classification_eval_transform() transform from this repo; a sketch of our extraction code is below. The metrics we get are 0.58 on Oxford-M and 0.337 on Oxford-H, which is much lower than the numbers reported in the paper (0.729 on Oxford-M and 0.495 on Oxford-H).
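For completeness, this is essentially the feature-extraction loop we use. It is a sketch: `dinov2_vitb14` via torch.hub and `make_classification_eval_transform()` come from this repo, while the image loop and normalization are our own glue code:

```python
import torch
from PIL import Image
from dinov2.data.transforms import make_classification_eval_transform

model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitb14').cuda().eval()
transform = make_classification_eval_transform()

@torch.no_grad()
def extract_features(image_paths):
    feats = []
    for path in image_paths:
        img = transform(Image.open(path).convert('RGB')).unsqueeze(0).cuda()
        f = model(img)  # CLS token embedding, shape (1, 768) for ViT-B/14
        feats.append(torch.nn.functional.normalize(f, dim=-1))
    # Return a (D, N), L2-normalized matrix, as expected by the ranking code above
    return torch.cat(feats).T.cpu().numpy()
```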
If possible, could you help clarify:
- Which metric are you reporting in the paper: mean average precision (mAP) or mean precision at the kappa cutoffs (mP@k)?
- Do you include the 1M distractors in the evaluation?
- Which transform should we use with the released backbone?
Similarly for the Met dataset, we cannot reproduce the reported metrics for either OpenCLIP-G/14 or DINOv2 ViT-B/14.
It would be great if you could provide the code used to run on the eval sets, or the generated embeddings!
Thanks!