VanillaNet

Questions about hyperparameter λ

Open • abcsimple opened this issue 1 year ago • 1 comment

Hi! Your work is fantastic! I have two questions regarding the hyperparameter λ:

1. According to Formula 1, λ should start at 0 and end at 1 over the course of training. However, the "Training Details" section (the final section of the paper) says that "The λ in Equ. 1 is linearly decayed from 1 to 0 on epoch 0 and 100". Which description is correct?

2. When λ approaches 0, did you use ‘torch.nn.functional.leaky_relu(x, 0)’ to represent y=x? This is a brilliant idea, but it's hard to notice.

abcsimple, Jun 07 '23 09:06

Sorry for the mistake. The λ in Eq. 1 should be linearly increased from 0 to 1 between epoch 0 and epoch 100. We will fix this typo. We do use ‘torch.nn.functional.leaky_relu(x, 0)’ to represent y=x, and we will add a remark to make the code easier to read. Thanks for the suggestions!

HantingChen, Jun 07 '23 10:06
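
For readers following the thread, here is a minimal sketch of the two points discussed. It rests on assumptions not confirmed here: that Eq. 1 has the form (1 - λ)·A(x) + λ·x with A = ReLU (the formula itself is not shown in this thread), and that λ is clamped to 1 after the ramp. The names `lambda_schedule`, `blended_activation`, and `ramp_epochs` are hypothetical, and `leaky_relu` below is a plain-Python stand-in that follows the documented definition max(0, x) + negative_slope · min(0, x), not the repository's code.

```python
def lambda_schedule(epoch: float, ramp_epochs: int = 100) -> float:
    """Linearly increase lambda from 0 at epoch 0 to 1 at `ramp_epochs`.

    Holding lambda at 1 afterwards is an assumption of this sketch.
    """
    return min(max(epoch / ramp_epochs, 0.0), 1.0)


def relu(x: float) -> float:
    return max(0.0, x)


def blended_activation(x: float, lam: float) -> float:
    """Assumed form of Eq. 1: (1 - lambda) * ReLU(x) + lambda * x."""
    return (1.0 - lam) * relu(x) + lam * x


def leaky_relu(x: float, negative_slope: float) -> float:
    """Scalar stand-in for leaky ReLU: max(0, x) + negative_slope * min(0, x)."""
    return max(0.0, x) + negative_slope * min(0.0, x)


if __name__ == "__main__":
    for epoch in (0, 25, 50, 100):
        lam = lambda_schedule(epoch)
        for x in (-2.0, -0.5, 0.0, 1.5):
            # Under the assumed Eq. 1, the blend equals a leaky ReLU
            # whose negative slope is lambda.
            assert abs(blended_activation(x, lam) - leaky_relu(x, lam)) < 1e-12
        print(epoch, lam, leaky_relu(-2.0, lam))
```

Under these assumptions the blend coincides with a leaky ReLU whose negative slope equals λ. In this stand-in, a slope of 0 gives plain ReLU and a slope of 1 gives the identity y=x, so it is worth checking in the repository which value is actually passed as the second argument once the ramp has finished.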