
About the local learning rate

Open · hmgxr128 opened this issue 5 years ago · 12 comments

In Equation 13, how the value of \eta should be set is not clarified. I'm quite confused.

hmgxr128 avatar Apr 16 '20 08:04 hmgxr128

I'm also confused about it.

JamesHujy avatar Apr 16 '20 09:04 JamesHujy

eta is set to 0.2, which will be reported in the camera-ready version.

HantingChen avatar Apr 16 '20 09:04 HantingChen
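
For readers landing here later: if I'm reading Eq. 13 of the paper correctly, the adaptive local learning rate for an adder layer l is alpha_l = eta * sqrt(k) / ||Delta L(F_l)||_2, where k is the number of elements in the filter F_l. Below is a minimal PyTorch sketch using the eta = 0.2 stated above; the function name, the epsilon guard, and the example gradient tensor are illustrative, not taken from the AdderNet code.

```python
import torch

def local_lr(grad: torch.Tensor, eta: float = 0.2) -> float:
    """Adaptive local learning rate per Eq. 13:
    alpha_l = eta * sqrt(k) / ||grad||_2, with k = number of filter elements."""
    k = grad.numel()
    # small epsilon (our addition) guards against a zero-norm gradient
    return eta * (k ** 0.5) / (grad.norm(p=2).item() + 1e-12)

# hypothetical example: gradient of a bank of 16 adder filters of shape 3x3x3
grad = torch.randn(16, 3, 3, 3)
print(local_lr(grad))
```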

Thank you, Hanting. And what about the T_max (period) and eta_min (lower bound) in the cosine learning rate decay of the MNIST experiment?

hmgxr128 avatar Apr 17 '20 07:04 hmgxr128

> Thank you, Hanting. And what about the T_max (period) and eta_min (lower bound) in the cosine learning rate decay of the MNIST experiment?

0.1 and 0

HantingChen avatar Apr 17 '20 07:04 HantingChen

> 0.1 and 0

Sorry, but T_max is supposed to be an integer giving the maximum number of iterations. Should it be 50 (the number of epochs)? And I guess you mean the initial learning rate is 0.1?

hmgxr128 avatar Apr 17 '20 07:04 hmgxr128

> Sorry, but T_max is supposed to be an integer giving the maximum number of iterations. Should it be 50 (the number of epochs)? And I guess you mean the initial learning rate is 0.1?

Sorry for the mistake. T_max is 50 and the initial learning rate is 0.1.

HantingChen avatar Apr 17 '20 07:04 HantingChen
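
Putting the corrected values together, here is a minimal sketch of the MNIST schedule using PyTorch's built-in cosine scheduler. The model below is just a placeholder; only the hyperparameters (initial lr 0.1, T_max = 50, eta_min = 0) come from this thread.

```python
import torch

model = torch.nn.Linear(784, 10)  # placeholder, stands in for the actual AdderNet model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # initial learning rate 0.1
# cosine decay over 50 epochs down to eta_min = 0
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50, eta_min=0)

for epoch in range(50):
    # ... one training epoch ...
    scheduler.step()  # advance the cosine schedule once per epoch
```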

@HantingChen If eta is not set to 0.2, for example if we set it to 0.1 or 0.4, will the result be quite different from the one with 0.2?

brisker avatar May 01 '21 15:05 brisker

> If eta is not set to 0.2, for example if we set it to 0.1 or 0.4, will the result be quite different from the one with 0.2?

The ablation study can be found in the paper (Table 4 in https://openaccess.thecvf.com/content_CVPR_2020/papers/Chen_AdderNet_Do_We_Really_Need_Multiplications_in_Deep_Learning_CVPR_2020_paper.pdf).

HantingChen avatar May 06 '21 02:05 HantingChen

@HantingChen The paper reports 91.84 for ResNet-20 on CIFAR-10 and states there are no multiplications at all, but in your code the first and last layers are ordinary convolution layers. If the first and last layers were also adder layers, would ResNet-20 on CIFAR-10 still reach 91.84? Or is this a slip in the paper?

brisker avatar May 06 '21 03:05 brisker

@HantingChen Also, on ImageNet and CIFAR, is the effect of eta the same as on MNIST, i.e. only about a 0.2-point fluctuation? Do you have comparison experiments on this?

brisker avatar May 06 '21 06:05 brisker

> The paper reports 91.84 for ResNet-20 on CIFAR-10 and states there are no multiplications at all, but in your code the first and last layers are ordinary convolution layers. If the first and last layers were also adder layers, would ResNet-20 on CIFAR-10 still reach 91.84? Or is this a slip in the paper?

Hi, the model in the paper is the same as the one in the code. The tables in the paper omit the multiplications of the first and last layers, because they account for far less than the whole model's computation (as stated in the paper).

HantingChen avatar May 17 '21 02:05 HantingChen

> Also, on ImageNet and CIFAR, is the effect of eta the same as on MNIST, i.e. only about a 0.2-point fluctuation? Do you have comparison experiments on this?

No comparison experiments yet.

HantingChen avatar May 17 '21 02:05 HantingChen