That's right. p1, p2 and the conv weights are all learnable parameters.
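A minimal sketch of how those learnable parameters enter ACON-C, assuming the formula (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x from the paper; see acon.py in this repo for the reference implementation.

```python
import torch
import torch.nn as nn

class AconC(nn.Module):
    def __init__(self, width):
        super().__init__()
        # p1, p2 and beta are all learnable, one value per channel
        self.p1 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, width, 1, 1))
        self.beta = nn.Parameter(torch.ones(1, width, 1, 1))

    def forward(self, x):
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x
```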
r is the channel reduction ratio; you can adjust it to fit your needs, and it has little effect on accuracy. This is a very common trick to reduce the number of parameters, and has been standard practice since HyperNetworks in 2016. To quote that paper's explanation: "a one-layered hypernetwork would have Nz × Nin × fsize × Nout × fsize learnable parameters which is usually much bigger than a two-layered hypernetwork does."
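A rough sketch of where r enters, assuming the meta-ACON beta generator is a two-layer 1x1-conv bottleneck over channel-wise means (the default r=16 below is illustrative); with two layers the parameter count is roughly 2·C·C/r instead of C·C for one wide layer, which is the point the quote makes.

```python
import torch
import torch.nn as nn

class MetaBeta(nn.Module):
    def __init__(self, width, r=16):
        super().__init__()
        hidden = max(width // r, 4)  # bottleneck width, shrunk by the ratio r
        self.fc1 = nn.Conv2d(width, hidden, kernel_size=1)
        self.fc2 = nn.Conv2d(hidden, width, kernel_size=1)

    def forward(self, x):
        # channel-wise statistics -> bottleneck -> per-channel beta in (0, 1)
        y = x.mean(dim=(2, 3), keepdim=True)
        return torch.sigmoid(self.fc2(self.fc1(y)))
```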
@jinfagang It depends on the hardware platform; normally a 10%-20% latency increase.
@jinfagang But ACON is a good choice: it has the same speed as Swish, and both match the speed of ReLU if implemented with hard-sigmoid :)
@jinfagang I suggest ACON-C, which improves performance with negligible overhead and shows a good accuracy-speed tradeoff.
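A hedged sketch of the hard-sigmoid trick mentioned above: swapping torch.sigmoid for F.hardsigmoid (relu6(x + 3) / 6) in the ACON-C form, so the gate is piecewise linear and avoids exp() on hardware where that matters. The function name is illustrative, not from this repo.

```python
import torch.nn.functional as F

def acon_c_hard(x, p1, p2, beta):
    # same form as ACON-C, with the sigmoid gate replaced by the
    # piecewise-linear hard-sigmoid to keep inference cost ReLU-like
    dpx = (p1 - p2) * x
    return dpx * F.hardsigmoid(beta * dpx) + p2 * x
```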
@yxNONG MetaAcon uses a small network to generate beta; in this work we try some network examples which show that sigmoid performs well. More choices and designs of this small...
Hi @feizhaixiaomimei , since ACON is a general form, you can use it in any network by simply replacing ReLU.
It can be used in fully connected layers. For example, if your tensor shape is (batch, width), you can simply change https://github.com/nmaac/acon/blob/main/acon.py#L13-15 to:
self.p1 = nn.Parameter(torch.randn(1, width))
self.p2 = nn.Parameter(torch.randn(1, width))
self.beta = nn.Parameter(torch.ones(1, width))
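A minimal sketch of that fully-connected variant, assuming inputs of shape (batch, width); only the parameter shapes change, the ACON-C formula itself stays the same. The class name is just for illustration.

```python
import torch
import torch.nn as nn

class AconC1d(nn.Module):
    def __init__(self, width):
        super().__init__()
        self.p1 = nn.Parameter(torch.randn(1, width))
        self.p2 = nn.Parameter(torch.randn(1, width))
        self.beta = nn.Parameter(torch.ones(1, width))

    def forward(self, x):
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x

# usage: replace ReLU after a Linear layer
layer = nn.Sequential(nn.Linear(128, 64), AconC1d(64))
out = layer(torch.randn(8, 128))  # shape (8, 64)
```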