
About the evaluation setup

neilfei opened this issue on Aug 21, 2019 • 3 comments

Hello, I'm just a little confused about the evaluation setup.

In Section 4.2 of your paper, it says that 'For evaluation, each test episode was formed by randomly sampling 15 queries for each of 5 classes, and the performance is averaged over 600 randomly generated episodes from the test set.' I take this to mean that every test episode has (5 * 15 =) 75 queries, so 75 graphs are formed from each test episode under the non-transductive setting.

However, according to your released code, it seems that every val/test episode has only (5 * 1 =) 5 queries, and you randomly sample 10,000 episodes for validation/testing.
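For concreteness, this is roughly how I understand the two setups differ (just my own sketch; the names are illustrative and not taken from your code):

```python
import random

def sample_episode(class_to_samples, n_way=5, n_shot=5, n_query=15):
    """Sample one n_way-way, n_shot-shot episode with n_query queries per class."""
    classes = random.sample(list(class_to_samples), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        picked = random.sample(class_to_samples[cls], n_shot + n_query)
        support += [(x, label) for x in picked[:n_shot]]
        query += [(x, label) for x in picked[n_shot:]]
    return support, query

# Paper, Section 4.2: 600 test episodes with n_query=15 -> 5 * 15 = 75 queries per episode
# Released code:      10,000 test episodes with n_query=1 -> 5 * 1 = 5 queries per episode
```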

I'm just wondering which evaluation setup you used when obtaining the results in your paper. And have you tried them both? If so, is there any difference between the results obtained with these two evaluation setups? Thank you!

neilfei avatar Aug 21 '19 15:08 neilfei

Hi, I have some questions about the evaluation setup too.

  1. In your evaluation code, you use the labels of the support samples to initialize the edges; however, during testing we should not have access to the ground truth.

  2. You use the label information to initialize the edges during both the training and testing phases, and then evaluate all of these nodes (see the sketch after this list). Is that fair? Thank you!
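To make the question concrete, this is roughly how I understand the label-based edge initialization for one episode graph (just my own sketch, not your actual code): support-support edges are set from the ground-truth labels, and every edge touching a query node starts at a neutral 0.5.

```python
import torch

def init_edges(support_labels, num_queries):
    """Sketch of label-based edge initialization for one episode graph.

    support_labels: LongTensor [num_support] with the ground-truth support labels.
    Support-support edges: 1 if the two nodes share a class, else 0.
    Any edge involving a query node: 0.5, since its label is treated as unknown.
    """
    num_support = support_labels.size(0)
    n = num_support + num_queries
    edges = torch.full((n, n), 0.5)
    same = (support_labels.unsqueeze(0) == support_labels.unsqueeze(1)).float()
    edges[:num_support, :num_support] = same
    return edges
```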

FengJH14 avatar Aug 22 '19 03:08 FengJH14

Same question as @neilfei.

kikyou123 avatar Aug 24 '19 10:08 kikyou123

Same as @neilfei. The setting 'num_queries = 1' does not match the standard few-shot learning setup.

shi1997Yee avatar Nov 20 '20 07:11 shi1997Yee