
train on CIFAR100 with --nb_cl=10

Open yhchen12101 opened this issue 5 years ago • 6 comments

Hi, yaoyao. I directly followed the instructions and ran the code "python main.py --nb_cl_fg=50 --nb_cl=10 --nb_protos 20 --resume --imprint_weights".

I expected the accuracy to be over 55%, consistent with the results in Fig. 4(a), but the result was only about 52%.

The average accuracy I got was about 62%, but the result in Table 1 is 64.95%.

I am wondering whether I did something wrong or whether these are reasonable results.

Thanks~

yhchen12101 avatar Jun 04 '20 13:06 yhchen12101

Hi @Julie12101,

Thanks for your interest in our work.

I have checked this repository and fixed some bugs. The results for the last phase when running python main.py --nb_cl_fg=50 --nb_cl=10 --nb_protos 20 --resume --imprint_weights are as follows:

[screenshot: last-phase results]

Please try to run the experiments with the updated version.

yaoyao-liu avatar Jun 04 '20 17:06 yaoyao-liu

Hi @yaoyao-liu, thanks for sharing and for your reply. I have run the updated version and the performance now reaches what you shared. However, I still have a few questions about the code:

  1. It seems that the model trains two feature extractors (free_model, and the weight scaling and shifting parameters of tg_model) and uses both at test time. I am not sure whether I understand this correctly; this design does not seem to be mentioned in the paper.

  2. In your updated version, the first phase and the remaining phases use different numbers of training epochs (160 and 60). Could you please share the number of epochs used for the ImageNet-Subset experiment?
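The phase-dependent epoch schedule described above (160 epochs for the first phase, 60 for the rest, as stated in this thread) could be sketched as follows; the function name is illustrative, not the repository's:

```python
def epochs_for_phase(phase: int) -> int:
    """Return the number of training epochs for a given incremental phase.

    Phase 0 (the first nb_cl_fg classes) trains from scratch and uses
    160 epochs; later phases fine-tune and use 60 epochs.
    """
    return 160 if phase == 0 else 60

# Example: schedule for phase 0 through phase 5
print([epochs_for_phase(p) for p in range(6)])  # [160, 60, 60, 60, 60, 60]
```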

Thanks in advance!

yhchen12101 avatar Jun 12 '20 05:06 yhchen12101

Hi @Julie12101,

For the scaling and shifting weights, please refer to Section B of the supplementary material for details: arXiv link. In our previous experiments, we used a uniform 160 epochs. I found the performance to be almost the same, so I decreased the number of epochs to 60, since the model has already converged by then. You may also use 160 epochs uniformly.
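A minimal sketch of the scaling-and-shifting idea as I read it from the discussion (per-channel scale and shift parameters adapting a frozen base convolution); the class and parameter names are illustrative and not taken from the repository:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSConv2d(nn.Module):
    """Conv layer whose frozen base weights are adapted by learned
    per-channel scaling and shifting parameters."""
    def __init__(self, base_conv: nn.Conv2d):
        super().__init__()
        self.base = base_conv
        for p in self.base.parameters():
            p.requires_grad = False  # base weights stay frozen
        out_ch = base_conv.out_channels
        self.scale = nn.Parameter(torch.ones(out_ch, 1, 1, 1))  # per-channel scale
        self.shift = nn.Parameter(torch.zeros(out_ch))          # per-channel shift

    def forward(self, x):
        w = self.base.weight * self.scale  # scale the frozen weights
        b = self.shift if self.base.bias is None else self.base.bias + self.shift
        return F.conv2d(x, w, b,
                        stride=self.base.stride, padding=self.base.padding)

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
ss = SSConv2d(conv)
y = ss(torch.randn(2, 3, 32, 32))
print(y.shape)  # torch.Size([2, 8, 32, 32])
```

Only scale and shift receive gradients here, so each incremental phase adapts far fewer parameters than full fine-tuning.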

I'll update the implementation for ImageNet-Subset and ImageNet in this repository later, and I'll let you know when it is available.

Best, Yaoyao

yaoyao-liu avatar Jun 12 '20 06:06 yaoyao-liu

Hi @yaoyao-liu ,

Thanks for your prompt and detailed response. I have read the supplementary material and find the design quite interesting. However, the document only describes how to train tg_model with the weight transfer operations; it does not mention using an additional free_model. As I understand your updated version, there are two feature extractors: free_model (without weight transfer operations) and tg_model (with them). During training, the model fuses the two feature extractors via the process_inputs_fp function and updates both of them; similarly, at test time it uses both to get the final prediction. I am wondering whether I understand this correctly.
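My reading of the two-extractor fusion could be sketched as below. This is a hypothetical simplification, not the repository's process_inputs_fp: I assume the two backbones produce same-sized feature vectors that are averaged before a shared classifier.

```python
import torch
import torch.nn as nn

class FusedExtractor(nn.Module):
    """Sketch: fuse the features of two backbones (a 'free' one and a
    weight-transferred one) before a shared classifier. Both branches
    are used at training and at test time."""
    def __init__(self, free_model, tg_model, feat_dim, num_classes):
        super().__init__()
        self.free_model = free_model
        self.tg_model = tg_model
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        f_free = self.free_model(x)          # features without weight transfer
        f_tg = self.tg_model(x)              # features with weight transfer
        fused = 0.5 * (f_free + f_tg)        # simple average fusion (assumption)
        return self.fc(fused)

# Dummy backbones standing in for the two ResNet feature extractors
backbone = lambda: nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16))
model = FusedExtractor(backbone(), backbone(), feat_dim=16, num_classes=10)
logits = model(torch.randn(4, 3, 8, 8))
print(logits.shape)  # torch.Size([4, 10])
```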

Thanks in advance!

yhchen12101 avatar Jun 12 '20 08:06 yhchen12101

Hi @Julie12101,

We update both free_model and tg_model during training. The aim is to provide different degrees of freedom for model adaptation. You may set --fusion_mode to free to see the ablation results that use two free_models without the weight transfer operations. This shows that the improvements come from the weight transfer operations rather than from the additional model capacity.

For more details on the weight transfer operations, you may also refer to our work on meta-transfer learning; the related paper contains detailed analyses and discussion of the technique.

Best, Yaoyao

yaoyao-liu avatar Jun 12 '20 08:06 yaoyao-liu

Hi @Julie12101,

Following up on your additional feedback, we clarify as follows.

  • The improvements do not come from the additional model capacity. The ablation results for the setting --nb_cl_fg=50 --nb_cl=10 --nb_protos 20 --imprint_weights are shown in the table below; we can observe that using 2x free directly does not improve the results. (Note that the implementation for the 2x free ablation is already included in the current repository; you may run it with python main.py --nb_cl 10 --imprint_weights --fusion_mode free.)

| Setting  | Acc. for the last phase | Mean Acc. |
| -------- | ----------------------- | --------- |
| Ours     | 58.21%                  | 66.51%    |
| 2x free  | 54.18%                  | 62.26%    |
| Baseline | 54.30%                  | 63.17%    |
  • Using tg_model + free_model is a newly added technique. It achieves further improvements and is not part of the original paper. We'll also upload the original implementation to this repository later (in another branch).

Thanks for your interest in our paper. Feel free to add comments in this issue or create a new issue for other questions.

yaoyao-liu avatar Jun 16 '20 09:06 yaoyao-liu