fewshot-egnn
About the evaluation setup
Hello, I'm just a little confused about the evaluation setup.
In Section 4.2 of your paper, it says: 'For evaluation, each test episode was formed by randomly sampling 15 queries for each of 5 classes, and the performance is averaged over 600 randomly generated episodes from the test set.' I take this to mean that every test episode has (5 * 15 =) 75 queries, so 75 graphs are formed per test episode under the non-transductive setting.
However, in your released code, every val/test episode seems to have only (5 * 1 =) 5 queries, and 10,000 episodes are randomly sampled for validation/testing.
Which evaluation setup did you use to obtain the results in the paper? Have you tried both, and if so, is there any difference between the results under these two evaluation setups? Thank you!
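For concreteness, here is a minimal sketch of the episode-sampling scheme being compared; `sample_episode` and the `class_to_images` dataset interface are hypothetical and only illustrate the two configurations, not the repository's actual loader:

```python
import random

def sample_episode(class_to_images, n_way=5, n_shot=5, n_queries_per_class=15):
    """Sample one N-way K-shot episode with a fixed number of queries per class.

    `class_to_images` maps each class label to a list of its examples
    (hypothetical interface; the released code uses its own data loader).
    """
    classes = random.sample(list(class_to_images), n_way)
    support, queries = [], []
    for label in classes:
        imgs = random.sample(class_to_images[label], n_shot + n_queries_per_class)
        support += [(x, label) for x in imgs[:n_shot]]
        queries += [(x, label) for x in imgs[n_shot:]]
    return support, queries

# Paper's stated protocol: n_queries_per_class=15 over 600 episodes
#   -> 75 queries (hence 75 non-transductive graphs) per episode.
# Released code: n_queries_per_class=1 over 10,000 episodes
#   -> 5 queries per episode.
```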
Hi, I have some questions about the evaluation setup too.
- In the evaluation part, you use the labels of the support samples to initialize the edges; however, during testing, we should not have access to the ground truth.
- You use the label information to initialize the edges during both the training and testing phases, and then evaluate all of these nodes. Is that fair? Thank you!
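For reference, here is a minimal sketch of the kind of label-based edge initialization being questioned, assuming the scheme described in the EGNN paper where support-support edges are initialized from ground-truth label agreement and any edge touching a query node is set to an uninformative 0.5. The function name and tensor layout are illustrative, not the repository's actual code:

```python
import torch

def init_edge_labels(labels, num_support):
    """Initialize edge features from node labels (illustrative sketch).

    Support-support edges use ground-truth agreement (1 if same class, else 0);
    any edge involving a query node is initialized to 0.5, so query ground
    truth is not consumed when building the graph.
    """
    n = labels.size(0)
    same = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()  # [n, n]
    edge = torch.full((n, n), 0.5)
    # Only the support-support block sees ground-truth labels.
    edge[:num_support, :num_support] = same[:num_support, :num_support]
    return edge

# Example: 2 support nodes followed by 1 query node.
labels = torch.tensor([0, 1, 0])
print(init_edge_labels(labels, num_support=2))
```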
Same question as @neilfei.
Same as @neilfei. Setting `num_queries = 1` does not match the standard few-shot learning evaluation setup.