
Add GLU, Swish, Mish and SwiGLU

Open RaulPPelaez opened this issue 7 months ago • 0 comments

  • [x] Added Swish and Mish as possible activation functions.
  • [x] Added a Gated Linear Unit (GLU) module (eq. 1 in https://arxiv.org/pdf/1612.08083 ), with the activation function exposed as a parameter.
  • [x] Added the SwiGLU activation function https://arxiv.org/pdf/2002.05202

SwiGLU has learnable parameters and needs to know the input/output shapes at creation, unlike the rest of the activation functions (e.g. SiLU). So, for now, it cannot be selected as an activation function in a yaml file.

I added it as a tool for NNP development.
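To illustrate why SwiGLU needs shapes at creation, here is a minimal PyTorch sketch of a GLU with a configurable gate activation, following eq. 1 of the GLU paper. The names `GLUSketch` and `swiglu_sketch` are hypothetical and are not the classes added in this PR; the actual signatures in torchmd-net may differ.

```python
import torch
from torch import nn


class GLUSketch(nn.Module):
    """Hypothetical GLU: (x W + b) * act(x V + c), with act as a parameter."""

    def __init__(self, in_features, out_features, activation=None):
        super().__init__()
        # Two learnable linear maps, so the shapes must be known up front.
        self.value = nn.Linear(in_features, out_features)
        self.gate = nn.Linear(in_features, out_features)
        # Sigmoid gives the original GLU from eq. 1.
        self.activation = activation if activation is not None else nn.Sigmoid()

    def forward(self, x):
        return self.value(x) * self.activation(self.gate(x))


def swiglu_sketch(in_features, out_features):
    """Hypothetical helper: a GLU gated by Swish/SiLU, i.e. a SwiGLU-style layer."""
    return GLUSketch(in_features, out_features, activation=nn.SiLU())


layer = swiglu_sketch(8, 4)
y = layer(torch.randn(3, 8))  # maps the last dimension from 8 to 4
```

Because the layer owns two weight matrices, it cannot be built from a name alone the way a parameter-free activation such as SiLU can.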

Mish and Swish can be selected via a yaml file.
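For reference, a plain-Python sketch of the two parameter-free functions and of a name-based lookup, which is the kind of selection a yaml string allows. The `ACTIVATIONS` table and `get_activation` helper are assumptions made for illustration, not torchmd-net's actual registry or key names.

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def _softplus(x):
    # log(1 + exp(x)); for large x this is x to within floating-point error.
    return x if x > 20 else math.log1p(math.exp(x))


def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x); beta = 1 is the same function as SiLU."""
    return x * _sigmoid(beta * x)


def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(_softplus(x))


# Hypothetical name -> function table, mimicking selection by a yaml string.
ACTIVATIONS = {"swish": swish, "mish": mish}


def get_activation(name):
    try:
        return ACTIVATIONS[name.lower()]
    except KeyError:
        raise ValueError(f"unknown activation: {name!r}") from None
```

Neither function holds state or depends on tensor shapes, which is what makes selection by name straightforward.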

RaulPPelaez, Jul 09 '24 12:07