
Generalize activation functions to complex input

Open · PhilipVinc opened this issue 6 years ago · 3 comments

Hi,

I would find it useful if the activation functions were generalised to handle complex inputs. Many of them are still well-defined in this case; for example, there is no reason for sigmoid to be limited to real values, and it could be easily generalized. The current definitions are restricted to `Real`:

```julia
σ(x::Real) = one(x) / (one(x) + exp(-x))
swish(x::Real) = x * σ(x)
```
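
For these two, I believe relaxing the type restriction is all that's needed. A minimal sketch of the proposed generalisation (the `Number` signature is my suggestion, not current NNlib code):

```julia
# Sketch: accept any Number (including Complex); the formula is
# well-defined wherever exp is, and Julia's exp handles complex
# arguments natively.
σ(x::Number) = one(x) / (one(x) + exp(-x))
swish(x::Number) = x * σ(x)

σ(1.0 + 2.0im)  # works out of the box
```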

softplus is also mathematically well defined for complex arguments, but its current implementation branches on the sign of `x`, so someone with more experience than me could comment on how to rewrite it efficiently:

```julia
softplus(x::Real) = ifelse(x > 0, x + log1p(exp(-x)), log1p(exp(x)))
```
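
One possible generalisation (my sketch, not a vetted implementation) branches on the real part, since that is what controls overflow in `exp`. Note that for complex `x` the two branches can differ by a multiple of `2πi` because of the logarithm's branch cut, which is exactly the kind of detail I'd want a second opinion on:

```julia
# Sketch: branch on real(x), because exp(x) overflows when real(x)
# is large and positive. Caveat: for complex x the two expressions
# agree only up to a 2πi multiple (choice of log branch).
softplus(x::Number) = real(x) > 0 ? x + log1p(exp(-x)) : log1p(exp(x))
```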

The main advantage of this is that people like me who work with complex-valued neural networks (often encountered in physics) could depend on NNlib and get the GPU versions of those functions with no effort.
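
For illustration, hypothetical usage once the generalisation lands (assumes a CUDA-capable GPU and the complementary CuArrays.jl change mentioned below):

```julia
using CuArrays
using NNlib: σ

# Move complex data to the GPU; broadcasting the relaxed σ would then
# compile to a fused GPU kernel with no extra work on the user's side.
z = CuArray(randn(ComplexF32, 1024))
y = σ.(z)
```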

Would you accept a PR (and a complimentary PR to CuArrays.jl) for that?

PhilipVinc · Sep 14 '19 11:09

I don't think there's any good reason for this beyond us not thinking of it at the time. So please do send a PR!

MikeInnes · Sep 17 '19 15:09

Great, thanks!

I noticed that in #118 (@devmotion) you were considering using the implementations of some activation functions from StatsFuns. As that PR is somewhat stale, is it still under consideration, or can I safely make a PR to NNlib?

PhilipVinc · Sep 18 '19 06:09

Yes, it's unfortunate that PR got missed, but since it needs a rebase anyway you may as well just PR against master.

MikeInnes · Nov 04 '19 14:11