Evaluate test_stream after every training epoch

Open SinHanYang opened this issue 3 years ago • 0 comments

Discussed in https://github.com/ContinualAI/avalanche/discussions/1111

Originally posted by SinHanYang August 8, 2022 Hi, I tried to evaluate the model on all experiences after every epoch. I ran the experiment on the CIFAR10 dataset and built the scenario with nc_benchmark:

train_data = CIFAR10(args.download_data, train=True, download=True, transform=train_transforms)
test_data = CIFAR10(args.download_data, train=False, download=True, transform=test_transforms)
scenario = nc_benchmark(train_data, test_data, n_experiences=2, shuffle=True, seed=SEED, task_labels=False)

Next, I set eval_every=1 in the strategy:

cl_strategy = EWC(
    model, optim.Adam(model.parameters(), lr=args.lr),
    CrossEntropyLoss(), args.ewc_lambda, args.ewc_mode, decay_factor=args.decay_factor,
    train_epochs=args.epoch, device=device, train_mb_size=args.batch_size,
    evaluator=eval_plugin, eval_every=1)

Finally, I passed the evaluation stream to the strategy's train function:

res = cl_strategy.train(experience,eval=scenario.test_stream)

However, the logger showed that it was still evaluating on the train stream: the confusion matrix covers 25000 examples. CIFAR10's training set has 50000 examples and its test set has 10000, so with two experiences 25000 matches a single training experience, not the test stream.

-- Starting eval on experience 0 (Task 0) from train stream --
100%|██████████| 98/98 [01:02<00:00,  1.58it/s]
> Eval on experience 0 (Task 0) from train stream ended.
	Loss_Exp/eval_phase/train_stream/Task000/Exp000 = 2.3020
	Top1_Acc_Exp/eval_phase/train_stream/Task000/Exp000 = 0.0003
-- >> End of eval phase << --
	ConfusionMatrix_Stream/eval_phase/train_stream = 
tensor([[   0,    0,    0,    0,    0,    0,    0,    0,    0,    0],
        [   0,    0,    0,    0,    0,    0,    0,    0,    0,    0],
        [4993,    0,    7,    0,    0,    0,    0,    0,    0,    0],
        [   0,    0,    0,    0,    0,    0,    0,    0,    0,    0],
        [4990,    0,   10,    0,    0,    0,    0,    0,    0,    0],
        [5000,    0,    0,    0,    0,    0,    0,    0,    0,    0],
        [4987,    0,   13,    0,    0,    0,    0,    0,    0,    0],
        [   0,    0,    0,    0,    0,    0,    0,    0,    0,    0],
        [4997,    0,    3,    0,    0,    0,    0,    0,    0,    0],
        [   0,    0,    0,    0,    0,    0,    0,    0,    0,    0]])
	Loss_Stream/eval_phase/train_stream/Task000 = 2.3020
	StreamForgetting/eval_phase/train_stream = 0.0000
	Top1_Acc_Stream/eval_phase/train_stream/Task000 = 0.0003
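
As a sanity check on the log above, the reported numbers can be recomputed from the confusion matrix itself. The sketch below copies the matrix from the log and assumes that rows index the true class, which the per-row totals of 5000 suggest, since CIFAR10 has 5000 training images per class. The variable names are illustrative only.

```python
# Rows of ConfusionMatrix_Stream/eval_phase/train_stream, copied from the log.
# Assumption: row index = true class, column index = predicted class.
train_cm = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [4993, 0, 7, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [4990, 0, 10, 0, 0, 0, 0, 0, 0, 0],
    [5000, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [4987, 0, 13, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [4997, 0, 3, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

# Total number of evaluated examples.
total = sum(sum(row) for row in train_cm)
# Classes that actually appear in the evaluated data (non-empty rows).
classes_seen = [i for i, row in enumerate(train_cm) if sum(row) > 0]
# Correct predictions sit on the diagonal.
correct = sum(train_cm[i][i] for i in range(len(train_cm)))

print(total)                       # 25000
print(classes_seen)                # [2, 4, 5, 6, 8]
print(round(correct / total, 4))   # 0.0003
```

The total of 25000 is half of the 50000 training images, and the accuracy matches the logged Top1_Acc_Stream value, which is consistent with the evaluation having run on training experience 0 only.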

My question is: how can I evaluate the model on all experiences after every training epoch and report the accuracy for each of them, just as cl_strategy.eval(scenario.test_stream) does:

Computing accuracy on the whole test set
-- >> Start of eval phase << --
-- Starting eval on experience 0 (Task 0) from test stream --
100%|██████████| 20/20 [00:12<00:00,  1.62it/s]
> Eval on experience 0 (Task 0) from test stream ended.
	ExperienceForgetting/eval_phase/test_stream/Task000/Exp000 = 0.5340
	Loss_Exp/eval_phase/test_stream/Task000/Exp000 = 13.0848
	Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp000 = 0.0000
-- Starting eval on experience 1 (Task 0) from test stream --
100%|██████████| 20/20 [00:12<00:00,  1.62it/s]
> Eval on experience 1 (Task 0) from test stream ended.
	Loss_Exp/eval_phase/test_stream/Task000/Exp001 = 0.9240
	Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp001 = 0.6240
-- >> End of eval phase << --
	ConfusionMatrix_Stream/eval_phase/test_stream = 
tensor([[777,  77,   0,  85,   0,   0,   0,  39,   0,  22],
        [118, 755,   0,  50,   0,   0,   0,  25,   0,  52],
        [188,  37,   0, 518,   0,   0,   0, 238,   0,  19],
        [ 84,  29,   0, 784,   0,   0,   0,  83,   0,  20],
        [122,  19,   0, 499,   0,   0,   0, 351,   0,   9],
        [ 70,  16,   0, 770,   0,   0,   0, 137,   0,   7],
        [ 39,  20,   0, 769,   0,   0,   0, 150,   0,  22],
        [ 89,  15,   0, 352,   0,   0,   0, 527,   0,  17],
        [635, 212,   0,  72,   0,   0,   0,  20,   0,  61],
        [180, 401,   0,  85,   0,   0,   0,  57,   0, 277]])
	Loss_Stream/eval_phase/test_stream/Task000 = 7.0044
	StreamForgetting/eval_phase/test_stream = 0.5340
	Top1_Acc_Stream/eval_phase/test_stream/Task000 = 0.3120
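
The per-experience and stream accuracies in this log can be recovered from the test-stream confusion matrix, which shows what the per-epoch evaluation would need to report. The sketch below again assumes rows are true classes. It also assumes experience 0 holds classes 2, 4, 5, 6 and 8, an inference from the non-empty rows of the train-stream matrix in the earlier log rather than something the post states.

```python
# Rows of ConfusionMatrix_Stream/eval_phase/test_stream, copied from the log.
test_cm = [
    [777, 77, 0, 85, 0, 0, 0, 39, 0, 22],
    [118, 755, 0, 50, 0, 0, 0, 25, 0, 52],
    [188, 37, 0, 518, 0, 0, 0, 238, 0, 19],
    [84, 29, 0, 784, 0, 0, 0, 83, 0, 20],
    [122, 19, 0, 499, 0, 0, 0, 351, 0, 9],
    [70, 16, 0, 770, 0, 0, 0, 137, 0, 7],
    [39, 20, 0, 769, 0, 0, 0, 150, 0, 22],
    [89, 15, 0, 352, 0, 0, 0, 527, 0, 17],
    [635, 212, 0, 72, 0, 0, 0, 20, 0, 61],
    [180, 401, 0, 85, 0, 0, 0, 57, 0, 277],
]

# Assumed class split (inferred from the train-stream log, not stated in the post).
exp0_classes = [2, 4, 5, 6, 8]
exp1_classes = [0, 1, 3, 7, 9]


def accuracy(cm, classes):
    """Fraction of examples of the given true classes predicted correctly."""
    correct = sum(cm[c][c] for c in classes)
    total = sum(sum(cm[c]) for c in classes)
    return correct / total


stream_acc = accuracy(test_cm, range(10))
exp0_acc = accuracy(test_cm, exp0_classes)
exp1_acc = accuracy(test_cm, exp1_classes)

print(round(stream_acc, 4))  # 0.312
print(round(exp0_acc, 4))    # 0.0
print(round(exp1_acc, 4))    # 0.624
```

These values agree with Top1_Acc_Stream (0.3120), Top1_Acc_Exp for Exp000 (0.0000) and Top1_Acc_Exp for Exp001 (0.6240) in the log.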

Thanks.

SinHanYang, Aug 30 '22 02:08
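
A possible explanation, offered as a sketch rather than a confirmed diagnosis: if the installed Avalanche version defines `train(experiences, eval_streams=None, **kwargs)`, then `eval=` is not a recognized parameter. It would be absorbed by `**kwargs`, and the periodic evaluation would fall back to the experience currently being trained on. The toy function below is a stand-in that only mimics that assumed signature and default. It is not Avalanche code, and the placeholder strings are hypothetical.

```python
# Toy stand-in, NOT Avalanche code: it mimics the assumed signature
# train(experiences, eval_streams=None, **kwargs) to show the fallback.

def train(experiences, eval_streams=None, **kwargs):
    """Return the streams that periodic evaluation would run on."""
    if eval_streams is None:
        # Assumed default: evaluate on whatever is being trained on.
        eval_streams = [experiences]
    return eval_streams


train_exp = "train experience 0"             # placeholder for `experience`
test_stream = ["test exp 0", "test exp 1"]   # placeholder for scenario.test_stream

# Call from the post: `eval=` lands in **kwargs and is silently ignored.
ignored = train(train_exp, eval=test_stream)

# Suggested call: pass a list of streams through `eval_streams`.
used = train(train_exp, eval_streams=[test_stream])

print(ignored)  # ['train experience 0']
print(used)     # [['test exp 0', 'test exp 1']]

# With the real library, the corresponding call would presumably be:
# cl_strategy.train(experience, eval_streams=[scenario.test_stream])
```

If that assumption holds, keeping `eval_every=1` and switching to `eval_streams=[scenario.test_stream]` should produce a test-stream evaluation over all experiences after each epoch. It is worth checking the `train` signature of the installed version before relying on this.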