HantingChen

107 comments by HantingChen

There are no plans to open-source it for now. Thanks for your interest.

It is a single parameter.

It is added after the BN layer.

eta is set to 0.2, which will be reported in the camera-ready version.
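(For context, eta here is, as far as I recall from the AdderNet paper, the hyper-parameter in the layer-wise adaptive learning rate for the adder filters, roughly:

```
\alpha_l = \frac{\eta \sqrt{k}}{\lVert \Delta L(F_l) \rVert_2}
```

where k is the number of elements in the filter tensor F_l of layer l; please verify against the paper.)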

> Thank you, Hanting. And what about the T_max (period) and eta_min (lower bound) in cosine learning rate decay of the MNIST experiment?

0.1 and 0
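(For reference, a minimal sketch of how such a cosine schedule is usually set up in PyTorch; the initial learning rate, T_max, and epoch count below are placeholders, not confirmed settings from this thread:

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(10, 10)                              # stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)      # placeholder lr
scheduler = CosineAnnealingLR(optimizer, T_max=50, eta_min=0)  # placeholder T_max

for epoch in range(50):
    # ... forward/backward pass and optimizer.step() go here ...
    optimizer.step()
    scheduler.step()  # anneal the learning rate once per epoch
```
)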

> @HantingChen
> If eta is not set to 0.2, for example 0.1 or 0.4, will the result be quite different from 0.2?

The ablation study...

> @HantingChen
> The paper reports 91.84 for resnet20-cifar10 and claims it is entirely multiplication-free, but in your code the first and last layers are ordinary convolution layers. So if the first and last layers were also adder layers, would resnet20-cifar10 still reach 91.84? Or is this a typo in the paper?

Hello, the model in the paper is the same as the one in the code. The table in the paper omits the multiplications of the first and last layers, because they account for far less than the computation of the whole model, as stated in the paper.

> @HantingChen
> Also, on ImageNet and CIFAR, is the influence of eta the same as on MNIST, i.e., only about a 0.2-point fluctuation? Do you have comparison experiments for this?

No comparison experiments yet.

We add a BN layer after the adder layer to address this problem, as described in our paper.
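(A minimal sketch of that layout, assuming an illustrative `Adder2d` layer; this is my own stand-in with a Conv2d-like interface, not the repository's implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Adder2d(nn.Module):
    """Illustrative adder layer: replaces the convolution's inner product
    with a negative L1 distance between each input patch and each filter."""

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0):
        super().__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.weight = nn.Parameter(
            torch.randn(out_channels, in_channels, kernel_size, kernel_size))

    def forward(self, x):
        n, _, h, w = x.shape
        # All sliding patches, flattened: (N, C*k*k, L)
        patches = F.unfold(x, self.kernel_size,
                           padding=self.padding, stride=self.stride)
        w_flat = self.weight.view(self.weight.size(0), -1)          # (O, C*k*k)
        # Negative sum of absolute differences for every patch/filter pair
        out = -(patches.unsqueeze(1) - w_flat[None, :, :, None]).abs().sum(dim=2)
        h_out = (h + 2 * self.padding - self.kernel_size) // self.stride + 1
        w_out = (w + 2 * self.padding - self.kernel_size) // self.stride + 1
        return out.view(n, -1, h_out, w_out)


# BN placed immediately after the adder layer, as noted above,
# to rescale the adder outputs before the nonlinearity.
block = nn.Sequential(
    Adder2d(16, 32, kernel_size=3, stride=1, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(inplace=True),
)
y = block(torch.randn(2, 16, 8, 8))   # -> shape (2, 32, 8, 8)
```
)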