hpatches-benchmark

Possible duplication in the definition file patches?

Open InnovArul opened this issue 6 years ago • 3 comments

Hi,

Thanks for collecting the huge dataset and providing an evaluation benchmark!

There seem to be duplicated patch combinations in the definition files. For example, the verif_pos_split-illum definition file contains about 0.36 million duplicated entries, the verif_neg_intra_split-illum definition file contains 763, and the verif_neg_inter_split-illum definition file contains 9.
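For anyone who wants to reproduce this kind of count, here is a minimal sketch. It assumes each definition-file entry has already been loaded as a pair of hashable, sortable patch identifiers; the on-disk format and loading step are not shown, and `count_duplicates` is a hypothetical helper, not part of the benchmark code.

```python
from collections import Counter


def count_duplicates(entries, unordered=False):
    """Count entries that repeat an earlier entry.

    entries:   iterable of patch pairs (assumed layout, not the real file format).
    unordered: if True, (a, b) and (b, a) are treated as the same pair.
    """
    if unordered:
        keys = (tuple(sorted(e)) for e in entries)
    else:
        keys = (tuple(e) for e in entries)
    counts = Counter(keys)
    # Every occurrence beyond the first one is a duplicate.
    return sum(c - 1 for c in counts.values())


# Toy data with made-up identifiers, only to show the behaviour.
toy = [(1, 2), (1, 2), (2, 1), (3, 4)]
print(count_duplicates(toy))                  # 1
print(count_duplicates(toy, unordered=True))  # 2
```

Whether a swapped pair should count as a duplicate depends on how the evaluation treats pair order, so the `unordered` flag is left as a choice.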

Evaluating with and without these duplicates would give different numbers (FPR95, PR curve). Is the duplication intentional? If not, is it possible to correct the definition files?
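To illustrate how repeated entries can move a metric, here is a toy sketch of FPR95, taken here as the false positive rate at the distance threshold that accepts 95% of the positive pairs. The distances are invented for illustration and the function is an assumption about how the metric is defined, not the benchmark's own evaluation code.

```python
def fpr95(pos_dists, neg_dists):
    """False positive rate at the threshold that recalls 95% of positives.

    A pair is predicted 'matching' when its descriptor distance <= threshold.
    """
    pos = sorted(pos_dists)
    # Integer ceil(0.95 * n): number of positives that must be accepted.
    k = -(-95 * len(pos) // 100)
    threshold = pos[k - 1]
    false_positives = sum(d <= threshold for d in neg_dists)
    return false_positives / len(neg_dists)


# Invented distances: 19 easy positives and one hard positive.
positives = [0.1] * 19 + [0.9]
negatives = [0.05, 0.5, 0.6, 0.7]

print(fpr95(positives, negatives))                # 0.25
# The same hard positive listed two extra times in the definition file.
print(fpr95(positives + [0.9, 0.9], negatives))   # 1.0
```

The example is deliberately extreme, but it shows the mechanism: duplicating a few hard positives shifts the 95% recall threshold and with it the reported number.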

Thanks!

InnovArul, Jan 13 '18 17:01

I also faced this. Is there any update on this?

bprashanth14, Jan 26 '18 14:01

Hi @InnovArul,

Thanks for investigating this. I remember we checked for duplicates before we saved the final tasks, but it is possible that there was a bug somewhere.

Did you see duplicated items in all the splits, or just the illum one?

I suspect this doesn't change the rankings, unless some descriptor is specifically sensitive to those duplicated patches (which is unlikely), but in any case we will try to correct this.

vbalnt, Jan 26 '18 15:01

Hi @vbalnt,

Thanks for the reply. So far, I have only checked the illum split.

I am not sure the rankings will stay the same. For example, some descriptors may happen to perform well on exactly those duplicated patches, in which case their numbers will go up, while descriptors that perform poorly on them will be penalized repeatedly. Either way, it could give an unfair estimate.

Thanks for your effort to correct this.

InnovArul, Jan 26 '18 22:01