
Reproducing experiments "On Tiny Episodic Memories in Continual Learning"

Open e7mul opened this issue 3 years ago • 0 comments

Hi,

I tried to reproduce the results you describe in Section 5.5 of your paper "On Tiny Episodic Memories in Continual Learning", because I couldn't find an implementation in your codebase and the experiment seems relatively easy to reproduce. I'm mostly interested in the results for the 20-degree rotation, where fine-tuning on the second task does not harm performance on the first one, so in fact I only want to reproduce this figure:

[Screenshot: the 20-degree rotation figure from Section 5.5 of the paper]

I've skimmed the paper and collected the following hyperparameters (my reproduction sketch follows the list):

  • MLP with 2 hidden layers of 256 units, each followed by ReLU
  • SGD with lr=0.1
  • CrossEntropy loss
  • Minibatch size=10
  • A single pass through the whole dataset
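For reference, here is a minimal sketch of how I set this up, assuming Rotated MNIST in PyTorch. The rotation helper, the angles (0° for task 1, 20° for task 2), and the data path are my guesses from the paper, not taken from your codebase:

```python
# Minimal reproduction sketch (my assumptions, not the paper's code):
# Rotated MNIST, 2-layer MLP, SGD lr=0.1, batch size 10, one epoch per task.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def make_loader(angle, train=True):
    # Rotate every image by a fixed angle to build one "task".
    tf = transforms.Compose([
        transforms.Lambda(lambda img: transforms.functional.rotate(img, angle)),
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.view(-1)),  # flatten 28x28 -> 784
    ])
    ds = datasets.MNIST("./data", train=train, download=True, transform=tf)
    return DataLoader(ds, batch_size=10, shuffle=train)

model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def train_one_epoch(loader):
    # Single pass through the whole dataset, as in the paper's protocol.
    model.train()
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

@torch.no_grad()
def accuracy(loader):
    model.eval()
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(1) == y).sum().item()
        total += y.numel()
    return correct / total

# Task 1: 0-degree MNIST; Task 2: MNIST rotated by 20 degrees.
train_one_epoch(make_loader(0, train=True))
acc_before = accuracy(make_loader(0, train=False))
train_one_epoch(make_loader(20, train=True))
acc_after = accuracy(make_loader(0, train=False))
print(f"task-1 test accuracy before/after task 2: {acc_before:.3f} / {acc_after:.3f}")
```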

Unfortunately, when I run this, my network reaches 96% accuracy on the test set after finishing the first task, in contrast to the ~85% you reported, and fine-tuning on the second task does lead to catastrophic forgetting (not so catastrophic in this case, but it costs roughly 5% accuracy on the first task's test set).

Could you please share some details about your experimental setup? Am I missing something?

e7mul · Apr 08 '21 13:04