GELUs
A smoother activation function (undergrad code)
Gaussian Error Linear Units (GELUs)
This repository contains code to reproduce the results of the paper Gaussian Error Linear Units (GELUs) by Dan Hendrycks and Kevin Gimpel (2016).
GELU Approximations
The sigmoid(1.702 * x) * x approximation is fast but somewhat inaccurate. Meanwhile, 0.5 * x * (1 + tanh(x * 0.7978845608 * (1 + 0.044715 * x * x))) is slower but more accurate. (Here 0.7978845608 ≈ sqrt(2/pi).)
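For concreteness, here is a minimal NumPy/SciPy sketch (not part of this repository) comparing both approximations against the exact GELU, x * Φ(x), computed with the error function:

```python
import numpy as np
from scipy.special import erf

def gelu_exact(x):
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_sigmoid(x):
    # Fast sigmoid approximation: sigmoid(1.702 * x) * x.
    return x / (1.0 + np.exp(-1.702 * x))

def gelu_tanh(x):
    # More accurate tanh approximation; 0.7978845608 ~= sqrt(2 / pi).
    return 0.5 * x * (1.0 + np.tanh(x * 0.7978845608 * (1.0 + 0.044715 * x * x)))

x = np.linspace(-5.0, 5.0, 1001)
print("sigmoid approx max error:", np.abs(gelu_sigmoid(x) - gelu_exact(x)).max())
print("tanh approx max error:   ", np.abs(gelu_tanh(x) - gelu_exact(x)).max())
```

The printed errors show the tanh variant tracking the exact GELU much more closely than the sigmoid one over this range.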
However, exact GELU implementations are now available in PyTorch, so these approximations are no longer necessary for adequate speed.
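For example, recent PyTorch exposes both the exact GELU and the tanh approximation directly (the approximate keyword assumes PyTorch 1.12 or newer):

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-4.0, 4.0, steps=9)

exact = F.gelu(x)                       # exact, erf-based GELU
approx = F.gelu(x, approximate='tanh')  # tanh approximation

# As a layer inside a model, use torch.nn.GELU()
# (or torch.nn.GELU(approximate='tanh') for the approximation).
act = torch.nn.GELU()
print((exact - approx).abs().max())
```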
Execution
Please install TensorFlow, Lasagne, and Python 3+.
Citation
If you find this useful in your research, please consider citing:
@article{hendrycks2016gelu,
  title={Gaussian Error Linear Units (GELUs)},
  author={Hendrycks, Dan and Gimpel, Kevin},
  journal={arXiv preprint arXiv:1606.08415},
  year={2016}
}