
Performance

Open PatrickZH opened this issue 6 years ago • 14 comments

I cannot reproduce the performance reported in the paper. The first model, trained from scratch, only reaches about 85% accuracy on average when the incremental step is 10, whereas the paper reports close to 90%. The incremental-learning performance is even worse. In my experience, the results are sensitive to the hyperparameters, e.g., the learning rate and the augmentation strategy. If you find any mistakes or have suggestions, feel free to contact me. Thank you!

PatrickZH avatar Feb 15 '19 02:02 PatrickZH

I think the temperature T you've set and the distillation loss differ from the paper's. The paper also says that when training on a novel class set, the network creates a new classification layer (the CL_i blocks in Figure 1 of the paper), whereas your code reuses the same network (ResNet) and trains its parameters without ever adding a new classification layer. That's just my opinion, and I may be wrong, hah.
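
What I mean is something roughly like the sketch below (just how I read the paper, with made-up names, not your code):

```python
# Rough sketch of a per-step classification head (the CL_i blocks), not the repo's code.
import torch
import torch.nn as nn

class IncrementalNet(nn.Module):
    def __init__(self, feature_extractor, feature_dim):
        super().__init__()
        self.features = feature_extractor   # e.g., a ResNet with its final fc removed
        self.heads = nn.ModuleList()        # one classification layer per incremental step
        self.feature_dim = feature_dim

    def add_head(self, num_new_classes):
        # Called at the start of each incremental step instead of reusing one fixed fc layer.
        self.heads.append(nn.Linear(self.feature_dim, num_new_classes))

    def forward(self, x):
        z = self.features(x)
        # Concatenate the logits of all heads so old and new classes share one output vector.
        return torch.cat([head(z) for head in self.heads], dim=1)
```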

wahahaLI avatar Mar 06 '19 08:03 wahahaLI

Actually, I have tried different temperatures T. I also tried muting the classifiers for classes that haven't been learned yet, and dynamically adding new classifiers to the network. But the performance is still low.

PatrickZH avatar Mar 07 '19 01:03 PatrickZH

Maybe you only changed the value of T? The formula in the paper raises pi and qi to the exponent 1/T, not just pi/T. Also, the distillation loss in the paper is computed from pi and qi (their modified versions), where qi is the ground truth, but in your code I find that the distillation loss uses the old classification logits and the new classification logits. Do you think that is the issue?
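
To make the two temperature formulations concrete, something like this (just an illustration, not your code):

```python
# Illustrative sketch of the two temperature variants discussed in this thread.
import torch
import torch.nn.functional as F

def soften_logits(logits, T):
    # Hinton-style distillation: divide the *logits* by T before the softmax.
    return F.softmax(logits / T, dim=1)

def soften_probs(logits, T):
    # Paper's description: raise the softmax *probabilities* to the exponent 1/T,
    # then renormalize. (Simply dividing the probabilities by T would cancel out
    # after renormalization and have no effect.)
    p = F.softmax(logits, dim=1)
    p = p.pow(1.0 / T)
    return p / p.sum(dim=1, keepdim=True)
```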

wahahaLI avatar Mar 07 '19 01:03 wahahaLI

Thank you! I think qi should be the old model's logits, even though the description of qi in the paper is confusing. If qi were the ground truth, i.e., one-hot labels, how could knowledge transfer from the old model to the new model?

PatrickZH avatar Mar 08 '19 07:03 PatrickZH

Maybe we can discuss it by email: bozhaonanjing @ gmail

PatrickZH avatar Mar 11 '19 08:03 PatrickZH

Just wanted to know the current performance. @PatrickZH

hardik2396 avatar Aug 03 '19 13:08 hardik2396

Hi. I did some more work and found some training tricks, but recently I have been busy with something else. I will probably update these things a few weeks from now. Keep in touch!

Best regards, Bo Zhao


PatrickZH avatar Aug 12 '19 20:08 PatrickZH

Could you please update the code so that it matches the results reported in the paper?

userDJX avatar Nov 05 '19 02:11 userDJX

I am busy with a new continual-learning task right now. I may update the code after finishing that task, probably in about 3 weeks. Sorry about that.

PatrickZH avatar Nov 05 '19 11:11 PatrickZH

Thanks for your contribution! Looking forward to your update!

LesiChou avatar Nov 15 '19 11:11 LesiChou

Hi, does anybody know the definition of the quantity shown in the attached image, from this paper's loss formula?

hust-nj avatar May 24 '20 02:05 hust-nj

@PatrickZH Thanks for sharing the code with us. I have a couple of questions. Is pi in the distillation loss the ground-truth label, or the probability before the weights are updated? And do you achieve the same accuracy as reported in the paper?

hust-nj avatar May 24 '20 02:05 hust-nj

  1. p^dist_ij is the "ground truth" produced by the old model, while q^dist_ij is produced by the new model (see the sketch below).
  2. Not yet.
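
A rough sketch of how I read that (illustrative only, not the repository's code):

```python
# Distillation term where the old model provides the softened targets p^dist
# and the model being trained provides q^dist. Not the repository's code.
import torch
import torch.nn.functional as F

def soften(logits, T):
    # Softmax probabilities raised to 1/T and renormalized, as described in the paper.
    p = F.softmax(logits, dim=1).pow(1.0 / T)
    return p / p.sum(dim=1, keepdim=True)

def distillation_loss(old_logits, new_logits, T=2.0):
    p_dist = soften(old_logits.detach(), T)  # "ground truth" from the frozen old model
    q_dist = soften(new_logits, T)           # output of the model being trained
    # Cross-entropy between the two softened distributions transfers old-class knowledge.
    return -(p_dist * torch.log(q_dist + 1e-8)).sum(dim=1).mean()
```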

Best regards, Bo Zhao


PatrickZH avatar May 25 '20 19:05 PatrickZH

Hi @PatrickZH, thanks for your great work on this. I'm curious about how you alternate between the net and net_old variables when forwarding the input and passing parameters to the optimizer, as shown in the attached screenshot. Could you please explain it a bit more?
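
For anyone reading later, a typical pattern for this kind of setup looks roughly like the following (just a sketch with made-up names such as net, loader, and the distillation_loss helper above, not the repository's actual code):

```python
# Sketch: net_old is a frozen snapshot from the previous step used only to produce
# distillation targets; only net's parameters are given to the optimizer.
import copy
import torch
import torch.nn.functional as F

net_old = copy.deepcopy(net).eval()        # snapshot before learning the new classes
for p in net_old.parameters():
    p.requires_grad_(False)                # never updated

optimizer = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)

for images, labels in loader:
    with torch.no_grad():
        old_logits = net_old(images)       # targets for distillation, no gradient
    new_logits = net(images)
    loss = F.cross_entropy(new_logits, labels) \
         + distillation_loss(old_logits, new_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```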

nttung1110 avatar Mar 22 '22 04:03 nttung1110