NeuralPDE.jl
Adaptive activation function
https://arxiv.org/abs/1906.01170
Also related is their follow-up work, which uses neuron-wise adaptive activations: https://arxiv.org/abs/1909.12228
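For concreteness, here is a minimal sketch (plain Flux, not an existing NeuralPDE.jl or DiffEqFlux API) of the layer-wise adaptive activation from these papers: the layer computes σ(n·a·(Wx + b)) with a trainable slope `a` initialized so that n·a = 1. The `AdaptiveDense` name is illustrative only.

```julia
using Flux

# Dense layer computing σ.(n .* a .* (W*x .+ b)) with a trainable slope `a`.
struct AdaptiveDense{M,B,A,F}
    W::M
    b::B
    a::A          # trainable slope (one per layer here; per-neuron in arXiv:1909.12228)
    n::Float32    # fixed scale factor n ≥ 1 from the paper, not trained
    σ::F
end

AdaptiveDense(in::Integer, out::Integer, σ = tanh; n = 10.0f0) =
    AdaptiveDense(Flux.glorot_uniform(out, in), zeros(Float32, out), [1.0f0 / n], n, σ)

(l::AdaptiveDense)(x) = l.σ.(l.n .* l.a .* (l.W * x .+ l.b))

Flux.@functor AdaptiveDense  # W, b, and a are collected by Flux.params; the scalar n is not
```

A `Chain(AdaptiveDense(2, 16), AdaptiveDense(16, 1, identity))` then behaves like an ordinary dense network, but the slopes are learned jointly with the weights.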
Where is the best place for adaptive activations to live? Architectural aspects such as FastDense are in DiffEqFlux, as far as I can tell. But does it make sense to add adaptive activations there if their only use is in PINNs?
This PINN-only stuff can make sense here, like https://github.com/SciML/NeuralPDE.jl/pull/336. We need to expand the docs to include a section for it.
This is also relevant: https://arxiv.org/abs/2006.09661
> Where is the best place for adaptive activations to live? Architectural aspects such as FastDense are in DiffEqFlux, as far as I can tell. But does it make sense to add adaptive activations there if their only use is in PINNs?
They're not just for use in PINNs. The core idea of the adaptive activation function is to add a slope-recovery term to the loss so that $S(a)$ is minimized, i.e. the slope parameter $a$ is maximized. That increases the gradient, which speeds up training and avoids vanishing gradients, so it is suitable for any neural-network optimization problem.
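As a sketch of that slope-recovery term, in the layer-wise form of arXiv:1909.12228 $S(a)$ is the reciprocal of the mean of $\exp(a^k)$ over the hidden layers; the `pinn_loss` and `slopes` names below are placeholders, not library functions:

```julia
# Slope-recovery term, layer-wise form from arXiv:1909.12228:
# S(a) = 1 / mean(exp.(a)), so minimizing S(a) pushes the slopes a^k up.
slope_recovery(as) = inv(sum(exp, as) / length(as))

# Added to the usual PINN objective, e.g. (placeholder names):
# total_loss(θ) = pinn_loss(θ) + slope_recovery(slopes(θ))
```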
It looks like they used a lot of effort to do very little, i.e., make the activation function have a large enough derivative. In fact, simply using a Gaussian activation function has the same effect.
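For comparison, using a Gaussian activation in an otherwise ordinary network is a one-liner (plain Flux, nothing PINN-specific; layer sizes are arbitrary):

```julia
using Flux

gaussian(x) = exp(-x^2)  # large derivative near the origin, smooth everywhere

chain = Chain(Dense(2, 16, gaussian), Dense(16, 16, gaussian), Dense(16, 1))
```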