VanillaNet
Questions about hyperparameter λ
Hi! Your work is fantastic! I have two questions regarding the hyperparameter λ:
1. According to Formula 1, λ should start at 0 and end at 1 during the training process. However, in the "Training Details" section (the final section of the paper), it says that "The λ in Equ. 1 is linearly decayed from 1 to 0 on epoch 0 and 100". Which explanation is correct?
2. When λ approaches 0, did you use `torch.nn.functional.leaky_relu(x, 0)` to represent y=x? This is a brilliant idea, but it's hard to notice.
Sorry for the mistake. The λ in Equ. 1 should be linearly increased from 0 to 1 between epoch 0 and epoch 100. We will fix this typo. We do use `torch.nn.functional.leaky_relu(x, 0)` to represent y=x, and we will add a comment to make the code easier to read. Thanks for the suggestions!
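For anyone reading this thread later, here is a minimal sketch of why a leaky ReLU can stand in for the blended activation. It assumes Equ. 1 has the form (1 - λ)·A(x) + λ·x with A = ReLU; the formula is not quoted in this thread, so treat that as an assumption. The helper names (`blended_activation`, `lambda_schedule`) are hypothetical and not taken from the repository, and `leaky_relu` here is a plain-Python stand-in that mirrors the scalar behaviour of `torch.nn.functional.leaky_relu(x, negative_slope)`. One thing to keep in mind: under PyTorch's definition, a negative slope of 0 gives plain ReLU and a negative slope of 1 gives y = x, so with λ increasing from 0 to 1 the slope argument would presumably be λ itself.

```python
def relu(x: float) -> float:
    """Standard ReLU on a scalar."""
    return max(0.0, x)


def leaky_relu(x: float, negative_slope: float) -> float:
    """Plain-Python stand-in for torch.nn.functional.leaky_relu on a scalar."""
    return x if x >= 0 else negative_slope * x


def blended_activation(x: float, lam: float) -> float:
    """Assumed form of Equ. 1: (1 - lam) * ReLU(x) + lam * x."""
    return (1.0 - lam) * relu(x) + lam * x


def lambda_schedule(epoch: float, ramp_epochs: int = 100) -> float:
    """Linear increase of lam from 0 (epoch 0) to 1 (epoch ramp_epochs), then held at 1."""
    return min(max(epoch / ramp_epochs, 0.0), 1.0)


if __name__ == "__main__":
    # Compare the blended activation with a leaky ReLU whose slope equals lam.
    for epoch in (0, 25, 50, 100):
        lam = lambda_schedule(epoch)
        for x in (-2.0, 0.5):
            print(epoch, lam, x, blended_activation(x, lam), leaky_relu(x, lam))
```

For x ≥ 0 both expressions return x, and for x < 0 both return λ·x, so the two columns printed above agree at every epoch. At λ = 0 the activation is ReLU, and at λ = 1 it is the identity, at which point the layer is purely linear.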