Reproducing experiments "On Tiny Episodic Memories in Continual Learning"
Hi,
I tried to reproduce the results described in section 5.5 of your paper "On Tiny Episodic Memories in Continual Learning", because I couldn't find an implementation in your codebase and the experiment seemed relatively easy to reproduce. I'm mostly interested in the 20-degree rotation setting, where fine-tuning on the second task does not harm performance on the first one, so in fact I only want to reproduce the corresponding figure from section 5.5.
I've skimmed the paper and gathered the following hyperparameters (a sketch of my setup follows the list):
- MLP with 2 hidden layers of 256 units each, each followed by ReLU
- SGD with lr=0.1
- CrossEntropy loss
- Minibatch size=10
- A single pass through the whole dataset
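This is roughly what my reproduction looks like. It's a minimal sketch in PyTorch (your codebase is TensorFlow, so none of these names come from it), assuming task 1 is unrotated MNIST and task 2 is the same data rotated by 20 degrees; the helpers `make_loader`, `train_one_pass`, and `accuracy` are my own:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import torchvision.transforms.functional as TF

def make_loader(rotation_deg, train=True):
    # Rotate every image by a fixed angle (0 for task 1, 20 for task 2;
    # my assumption about how the rotated tasks are constructed).
    tfm = transforms.Compose([
        transforms.Lambda(lambda img: TF.rotate(img, rotation_deg)),
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.view(-1)),  # flatten to 784
    ])
    ds = datasets.MNIST("data", train=train, download=True, transform=tfm)
    return DataLoader(ds, batch_size=10, shuffle=train)

# 2 hidden layers, 256 units each, ReLU after each hidden layer
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def train_one_pass(loader):
    # A single pass through the whole dataset, as in the paper.
    model.train()
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

@torch.no_grad()
def accuracy(loader):
    model.eval()
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

for deg in (0, 20):  # fine-tune sequentially: task 1, then task 2
    train_one_pass(make_loader(deg, train=True))
    print(f"after task rotated {deg} deg:",
          {d: accuracy(make_loader(d, train=False)) for d in (0, 20)})
```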
Unfortunately, with this setup I found that after finishing the first task my network has 96% accuracy on the test set, in contrast to the 85% you reported, and fine-tuning only on the second task does lead to catastrophic forgetting (not so catastrophic in this case, but it costs ~5% of accuracy on the first task's test set).
Could you please share any further details about your experimental setup? Am I missing something?