Difference between 'tokens_positive_eval' and 'tokens_positive'
Hi,
Thanks for the amazing work. May I ask whether, during inference, there is any fundamental difference between coco_img['tokens_positive'] and coco_img['tokens_positive_eval']? It looks like they contain the same word spans, just in a different order.
Hi @melongua
The difference is as follows:
- tokens_positive is the alignment that we use for computing the loss. It works in tandem with the boxes information: it is assumed that len(tokens_positive) == len(boxes), and that the first set of tokens corresponds to the first box, and so on.
- tokens_positive_eval is used only during Flickr evaluation. It is meant to contain the phrases for which we want ranked boxes. In this case there are no boxes associated in the ground truth (e.g., when we perform evaluation on the test split). It should be seen as the way to query the model on a given set of phrases.
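To make the distinction concrete, here is a minimal sketch of how the two fields might sit in a COCO-style annotation entry. The caption, box coordinates, and span values are invented for illustration; only the field names and the len(tokens_positive) == len(boxes) invariant come from the discussion above.

```python
# Hypothetical COCO-style image entry (values made up for illustration).
caption = "a man holding a red umbrella"

coco_img = {
    "caption": caption,
    # One box per annotated object, in the same order as tokens_positive.
    "boxes": [
        [10, 20, 50, 120],   # box for "a man"
        [60, 15, 40, 80],    # box for "a red umbrella"
    ],
    # Character spans into the caption, aligned 1:1 with "boxes":
    # tokens_positive[i] lists the (start, end) spans grounding boxes[i].
    "tokens_positive": [
        [(0, 5)],            # "a man"
        [(14, 28)],          # "a red umbrella"
    ],
    # Phrases to query at evaluation time; no box alignment is assumed,
    # so the order may differ from tokens_positive.
    "tokens_positive_eval": [
        [(14, 28)],
        [(0, 5)],
    ],
}

# The loss-side invariant: one span list per ground-truth box.
assert len(coco_img["tokens_positive"]) == len(coco_img["boxes"])

# Recover the phrase text from the spans.
for spans in coco_img["tokens_positive"]:
    for start, end in spans:
        print(caption[start:end])  # "a man", then "a red umbrella"
```

The point of the sketch: tokens_positive is meaningless without a parallel boxes list, whereas tokens_positive_eval is just a list of phrases to query, so its order carries no box alignment.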
I hope this clarifies their respective roles a bit; feel free to ask if you have further questions.
Hi @alcinos,
Thank you for your quick response. Just to double-check: is it correct to say that the only difference between them is their association with the ground-truth boxes, and that the word-span information they contain is the same? Or, to put it another way, when evaluating on the test split, is there no difference between using tokens_positive and tokens_positive_eval, since there is no box association?