HowToTrainYourMAMLPytorch

Questions about mini-imagenet experiments

Open silverbottlep opened this issue 5 years ago • 10 comments

Hi, thanks for sharing this code! It has been really helpful!

I ran 'mini-imagenet_maml_mini-imagenet_5_way_1_shot.json'. It runs 50,000 iterations, and each iteration processes 4 tasks (i.e. the meta-level batch size is 4). It reported a test accuracy of ~0.48 at the end, which is close to the one reported in the paper. But I had some questions.

  1. It seems the code ensembles the top-3 models, selected by validation accuracy. When I used only the top-1 model, I could only get ~0.46 accuracy, and the top validation accuracy was also ~0.46, so I think the boost to ~0.48 comes from the ensemble. But is that a valid experimental setup?

  2. cnn_num_filters was set to 48, but I believe the original MAML used 32 filters. I ran the experiment with 32 filters, and it gave me ~2% lower accuracy.

  3. 'first_order_to_second_order_epoch' was set to 10, which enables first-order optimization for the first 10 epochs. You mentioned it increased training stability, but I believe the original MAML didn't use this.

It would be great if you could resolve this issue! Thanks!
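For anyone skimming, the settings under discussion map onto config fields roughly like the sketch below. Only `cnn_num_filters` and `first_order_to_second_order_epoch` are key names actually mentioned in this thread; the remaining names are illustrative placeholders, not the repo's actual schema.

```python
# Rough Python-dict sketch of the relevant fields from the experiment JSON.
# Only cnn_num_filters and first_order_to_second_order_epoch are key names
# confirmed in this thread; the other names are illustrative placeholders.
config = {
    "batch_size": 4,                          # 4 tasks per meta-update (meta-batch)
    "cnn_num_filters": 48,                    # original MAML used 32 on mini-ImageNet
    "first_order_to_second_order_epoch": 10,  # first-order updates for the first 10 epochs
    "num_classes_per_set": 5,                 # 5-way classification
    "num_samples_per_class": 1,               # 1-shot
}
```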

silverbottlep avatar Dec 22 '18 22:12 silverbottlep

Thanks for your comments.

  1. Since the ensembling setup was used across all experiments in the paper, it is a valid setup. The reason I used the ensemble is that, after trying for a whole month, I could not reproduce the MAML results from the original MAML paper. I tested my own code and then looked at other PyTorch repos, none of which reproduced the results from the paper either. I used the ensembling to match the MAML results in the paper. The MAML++ results are always relative to MAML: if MAML gets x% at Mini-Imagenet 5-way 1-shot, then MAML++ will consistently get x+3.5% or so. That was our claim: that it improves on MAML. Note: the ensembling is not necessary for reproducing the Omniglot results.
  2. I used 48 because I found that this setup converged more consistently to the original MAML paper's results.
  3. I mentioned in the paper that the first-order to second-order switch is mainly for improving training speed, not stability (a sketch of how such a switch works follows this list).
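To make the switch concrete, here is a minimal PyTorch sketch of an epoch-gated first-order/second-order inner step. The function and argument names are illustrative rather than the repo's actual code; only the `first_order_to_second_order_epoch` setting comes from the config discussed above, and the inner update shown is a plain SGD step.

```python
import torch

def inner_update(loss, params, inner_lr, epoch, first_order_to_second_order_epoch=10):
    # Before the switch epoch, skip second derivatives (first-order MAML):
    # the inner gradients are treated as constants w.r.t. the meta-parameters,
    # which is cheaper but ignores curvature through the inner step.
    second_order = epoch >= first_order_to_second_order_epoch
    grads = torch.autograd.grad(loss, params, create_graph=second_order)
    # SGD-style inner-loop step. With create_graph=True the returned params
    # stay differentiable, so the outer (meta) loss can backprop through this
    # update; with create_graph=False the graph is cut here.
    return [p - inner_lr * g for p, g in zip(params, grads)]
```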

AntreasAntoniou avatar Dec 22 '18 22:12 AntreasAntoniou

Thanks for your prompt reply.

  1. I also ran other PyTorch implementations, and none of them (I tried 2 different codebases) reached the same level of accuracy reported in the original MAML paper (~48%). I understand your pain, but in many cases where meta-learning is useful, ensembling might not make sense, e.g. in RL applications, where sample efficiency is one of the main benefits of meta-learning. Anyway, I think it'd be better to mention this in the paper.

  2. I could get ~0.47 with 32 filters using an ensemble of 3 different models, but it seriously overfits the training set.

  3. Thanks. I don't think it would make a big difference in final accuracy either.

silverbottlep avatar Dec 22 '18 23:12 silverbottlep

  1. I don't see how my ensemble affects sample efficiency. I train a model for N epochs, then choose the checkpoints with the best validation accuracy across those N epochs. So, in terms of sample efficiency, it's the same as training a non-ensembled model; the difference is the inference time required for each new task seen.
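A minimal sketch of that selection-plus-averaging step, assuming per-epoch checkpoints and their validation accuracies were saved during training. `build_model` and the checkpoint handling here are hypothetical rather than the repo's actual API, and per-task inner-loop adaptation on the support set is omitted for brevity.

```python
import torch

def ensemble_predict(checkpoint_paths, val_accuracies, build_model, x, top_k=3):
    """Average predictions of the top-k checkpoints by validation accuracy."""
    # Picking checkpoints from a single training run costs no extra training
    # samples; the only overhead is top_k forward passes at inference time.
    ranked = sorted(zip(val_accuracies, checkpoint_paths), reverse=True)[:top_k]
    probs = []
    for _, path in ranked:
        model = build_model()
        model.load_state_dict(torch.load(path))
        model.eval()
        with torch.no_grad():
            probs.append(model(x).softmax(dim=-1))
    # Mean of the per-model class probabilities, then argmax over classes.
    return torch.stack(probs).mean(dim=0).argmax(dim=-1)
```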

AntreasAntoniou avatar Dec 22 '18 23:12 AntreasAntoniou

Yeah, what I meant was the inference cost for a new task. Also, it seems model ensembling in RL is not as popular as it is in the supervised classification setting.

BTW, do you remember the accuracy without the ensemble for the 5-shot mini-ImageNet task?

silverbottlep avatar Dec 23 '18 01:12 silverbottlep

Yeah, in my experience model ensembling doesn't work as well in RL.

Unfortunately, I don't remember the un-ensembled accuracies for the mini-ImageNet 5-shot variants.

AntreasAntoniou avatar Dec 23 '18 01:12 AntreasAntoniou

Thanks for your quick response, I really appreciate it. It would be great if you could share the non-ensembled results for both MAML++ and MAML. Anyway, your work will be very helpful for many people, thanks!

silverbottlep avatar Dec 23 '18 01:12 silverbottlep

Ok, I'll run the experiments without any ensembles and let you know.

AntreasAntoniou avatar Dec 23 '18 03:12 AntreasAntoniou

In the original MAML paper, I think they use 64 filters, not 32, for the classification tasks.

sudarshan1994 avatar Oct 12 '19 14:10 sudarshan1994

They used 64 for Omniglot and 32 for Mini Imagenet.

AntreasAntoniou avatar Oct 12 '19 16:10 AntreasAntoniou

Yeah, thanks for the correction. I ran the experiment "mini-imagenet_1_2_0.01_48_5_0" and I am getting a test accuracy of 50.30. Is this correct? In the MAML paper they only get around 48.50. Thanks a lot for this implementation; it's by far the cleanest implementation of MAML in PyTorch.

sudarshan1994 avatar Oct 15 '19 01:10 sudarshan1994