A question about the zero-shot eval on LVIS.

Open MengTanOwn opened this issue 1 year ago • 2 comments

The paper compares AP on LVIS when the number of visual prompts per category is 1, 4, 16, 32, and 64. However, some categories in LVIS do not have that many images. How is this situation handled?

Dec 04 '24 02:12 MengTanOwn

Hi @MengTanOwn, this does happen. In that case, we take as many samples as are available to generate the visual prompt. For example, if class A has only 5 samples in LVIS, we use just those 5 samples for that class.
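
As a rough illustration of the capping rule described in this reply, here is a minimal sketch, not the repository's actual code. The function name `sample_prompts`, the `samples_by_class` mapping, and the use of seeded random sampling are all assumptions made for the example; the reply only states that every available sample is used when a class has fewer than the requested number.

```python
import random


def sample_prompts(samples_by_class, k, seed=0):
    """Pick up to k prompt samples per class (hypothetical helper).

    samples_by_class: dict mapping a class name to a list of sample ids.
    k: requested number of visual prompts per class (e.g. 1, 4, 16, 32, 64).
    """
    rng = random.Random(seed)  # seeded so the selection is repeatable (assumption)
    picked = {}
    for cls, samples in samples_by_class.items():
        # Cap the request at the number of samples the class actually has.
        n = min(k, len(samples))
        picked[cls] = rng.sample(samples, n)
    return picked


# Class "A" has only 5 samples, class "B" has 100.
data = {"A": list(range(5)), "B": list(range(100))}
prompts = sample_prompts(data, k=16)
print(len(prompts["A"]), len(prompts["B"]))  # 5 16
```

With `k=16`, class A contributes all 5 of its samples while class B contributes 16, so rare categories are evaluated with fewer prompts than the nominal setting.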

Dec 05 '24 00:12 Mountchicken

Thanks for your answer. Are the visual prompts extracted from the training set or the validation set for LVIS zero-shot?

Dec 05 '24 03:12 MengTanOwn