
Generative Replay example using a VAE on SplitMNIST benchmark


This test reproduces the Generative Replay results of this paper, using a VAE as the generative model and 100 replay images.

Closes #27.
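For context, here is a minimal sketch of this kind of setup, assuming the Avalanche API this repository builds on (`SplitMNIST`, `SimpleMLP`, and the `GenerativeReplay` strategy from recent Avalanche versions). The hyperparameters below echo the values discussed in this thread rather than the exact test configuration:

```python
import torch
from torch.nn import CrossEntropyLoss
from torch.optim import Adam

from avalanche.benchmarks.classic import SplitMNIST
from avalanche.models import SimpleMLP
from avalanche.training.supervised import GenerativeReplay

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
benchmark = SplitMNIST(n_experiences=5)  # 5 experiences of 2 classes each
model = SimpleMLP(num_classes=10)

# Without an explicit generator_strategy, Avalanche trains its default VAE
# alongside the classifier and replays samples drawn from it.
strategy = GenerativeReplay(
    model,
    Adam(model.parameters(), lr=1e-3),
    CrossEntropyLoss(),
    train_mb_size=100,
    train_epochs=10,
    device=device,
    replay_size=100,  # the 100 replay images mentioned above
)

for experience in benchmark.train_stream:
    strategy.train(experience)
    strategy.eval(benchmark.test_stream)
```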

travela avatar Sep 08 '22 12:09 travela

Thanks @travela ! I have a few questions:

  1. Shouldn't the reference performance be 0.79, given Table 1 in the paper for the MNIST dataset?
  2. The paper says they use 50 epochs and lr=2e-4, while you are using 10 and 1e-3. Have you tried their configuration to see the results?
  3. If the performance is lower, you can try a larger replay size. It seems that in the paper the amount of replay samples is controlled by this parameter, which is then used by the GenerativeReplay class; try increasing the replay size (e.g., 150, 200) and see if the gap closes (a sketch of such a sweep follows this list).
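A hedged sketch of that sweep, assuming `replay_size` is the `GenerativeReplay` constructor argument referred to above; the helper name and the result-dictionary key are illustrative and depend on the installed Avalanche version:

```python
from torch.nn import CrossEntropyLoss
from torch.optim import Adam

from avalanche.benchmarks.classic import SplitMNIST
from avalanche.models import SimpleMLP
from avalanche.training.supervised import GenerativeReplay


def final_stream_accuracy(replay_size: int) -> float:
    """Hypothetical helper: train on SplitMNIST with the given replay
    size and return the final average test-stream accuracy."""
    benchmark = SplitMNIST(n_experiences=5)
    model = SimpleMLP(num_classes=10)
    strategy = GenerativeReplay(
        model,
        Adam(model.parameters(), lr=1e-3),
        CrossEntropyLoss(),
        train_mb_size=100,
        train_epochs=10,
        replay_size=replay_size,
    )
    for experience in benchmark.train_stream:
        strategy.train(experience)
    results = strategy.eval(benchmark.test_stream)
    # The exact metric key depends on the configured evaluator and version.
    return results["Top1_Acc_Stream/eval_phase/test_stream/Task000"]


for size in (100, 150, 200):
    print(f"replay_size={size}: {final_stream_accuracy(size):.4f}")
```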

AndreaCossu avatar Sep 08 '22 14:09 AndreaCossu

Yes, I guess it should, @AndreaCossu. Since the authors also provide a confidence interval of ±4.40%, I figured my result at least reaches the lower bound of that interval.

Good points regarding the number of epochs and the replay size. I tried several variations (50 epochs, which took forever on my local machine; 150 replay samples; and both combined), but none of them made much of a difference. In fact, more replay samples can degrade the results, since the model may not pay enough attention to the new samples and thus fails to learn, only remembering. I wrote my master's thesis on this topic, and what helps here is letting the number of replay samples grow dynamically (few in the beginning, more later on; sketched below). That would alter the "classical" implementation, but it would surely increase the accuracy.
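A minimal sketch of that dynamic schedule, with illustrative names not taken from the PR; recent Avalanche versions also appear to expose an `increasing_replay_size` flag on `GenerativeReplay` with a similar intent, though that is an assumption worth checking against the installed version:

```python
def replay_budget(n_seen_experiences: int, base_replay: int = 25) -> int:
    """Illustrative schedule (not in the PR): start with few replay
    samples and grow the budget linearly as more experiences are seen."""
    return base_replay * max(n_seen_experiences, 1)


# e.g., 25 replay samples after the first experience, 100 after the fourth
for n in range(1, 5):
    print(n, replay_budget(n))
```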

What might be the issue here is that I employ a simpler VAE model than the authors of the paper, so their replay images might be sharper and more balanced (a minimal illustration of such a "simple" VAE is sketched below).
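For reference, a plain-PyTorch sketch of the kind of shallow MLP VAE meant here; this is illustrative only, not the exact model from the PR. A shallow encoder/decoder like this tends to produce blurrier samples than the deeper generators used in the paper:

```python
import torch
from torch import nn


class SmallMlpVAE(nn.Module):
    """Illustrative 'simple' VAE: one hidden layer in each of the
    encoder and decoder, operating on flattened 28x28 MNIST images."""

    def __init__(self, input_dim: int = 28 * 28, hidden: int = 400, latent: int = 20):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(), nn.Linear(input_dim, hidden), nn.ReLU()
        )
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.decoder = nn.Sequential(
            nn.Linear(latent, hidden), nn.ReLU(),
            nn.Linear(hidden, input_dim), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z from N(mu, sigma^2).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(z), mu, logvar
```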

travela avatar Sep 09 '22 09:09 travela

Thanks @travela (sorry for the delay). I'll merge this PR, since the performance is still comparable to that of the original paper.

AndreaCossu avatar Dec 01 '22 15:12 AndreaCossu