About the variance of 0.016

Open 232525 opened this issue 3 years ago • 3 comments

Hi, authors. In your paper, you mentioned:

Specifically, we set $\epsilon$ to the mean $\ell_\infty$ norm of embedding differences between five captions that correspond to the same image. We estimated this based on captions of only 15 MS-COCO images.

Have you released the code used to estimate this variance?

232525, Nov 14 '22 12:11

Hi, yes. We calculate it using the predictions_runner.py script with the flags --ablation_dist and --text_autoencoder (the latter specifies using the CLIP text encoder rather than an image encoder; see the main method for this there: calc_distances_of_ready_embeddings). This prints the average max norm over 900 samples and also saves the 900 values to a pickle file, so you can verify that the estimate could also be made from only 15 samples with high confidence (i.e. the variance of the max norm across different sets is negligible).
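For readers who want to see the shape of the estimate without running the repository script, here is a minimal sketch. It is not the code in predictions_runner.py: the function name `mean_linf_distance`, the array layout, and the random stand-in embeddings are all assumptions made for illustration. Real use would substitute CLIP text embeddings of the five captions per image.

```python
import itertools
import numpy as np


def mean_linf_distance(caption_embeddings):
    """Mean l-infinity norm of pairwise caption-embedding differences.

    caption_embeddings: array of shape (n_images, n_captions, dim).
    Hypothetical helper, not part of the CapDec repository.
    Returns (mean, per_pair_values).
    """
    values = []
    for emb in caption_embeddings:
        # Compare every pair of captions that describe the same image.
        for i, j in itertools.combinations(range(len(emb)), 2):
            values.append(np.max(np.abs(emb[i] - emb[j])))
    values = np.asarray(values)
    return values.mean(), values


# Stand-in data (assumption): 900 "images", 5 captions each, 512-dim embeddings.
rng = np.random.default_rng(0)
fake = rng.normal(size=(900, 5, 512))
fake /= np.linalg.norm(fake, axis=-1, keepdims=True)  # unit-normalize, as is common for CLIP

full_mean, _ = mean_linf_distance(fake)
subset_mean, _ = mean_linf_distance(fake[:15])  # estimate from only 15 images
print(f"all images: {full_mean:.4f}, first 15 images: {subset_mean:.4f}")
```

If the two printed values are close, that supports the point above that a small subset gives a stable estimate. The numbers from random data say nothing about the 0.016 reported in the paper.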

DavidHuji, Nov 14 '22 12:11

Thanks for your reply. I calculated the average on the val set, and the result looks normal. One thing confuses me, though: the paper mentions the $\ell_\infty$ norm, but it seems that the $\ell_1$ norm result is the one adopted. Is there anything I missed or misunderstood?

232525, Nov 15 '22 02:11

The script prints a few different metrics, but the infinity norm is the one printed here (line 86).
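As a quick illustration of why the two norms are easy to confuse in printed output but give different numbers, here is a small sketch using NumPy's `np.linalg.norm`. The vector is a made-up example, not data from the script.

```python
import numpy as np

# Made-up difference vector between two caption embeddings (assumption).
diff = np.array([1.0, -3.0, 2.0])

l1 = np.linalg.norm(diff, ord=1)         # sum of absolute values
linf = np.linalg.norm(diff, ord=np.inf)  # largest absolute value

print(l1, linf)  # prints 6.0 3.0
```

The l1 norm grows with the embedding dimension, while the infinity norm only reflects the single largest coordinate difference, so the two can differ by orders of magnitude on high-dimensional embeddings.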

DavidHuji avatar Nov 15 '22 09:11 DavidHuji