Add RMS Normalization Layer
This PR introduces a new RMS (Root Mean Square) Normalization layer to Dlib. RMS Normalization is a simplified variant of Layer Normalization that normalizes activations by their root mean square instead of centering them and dividing by the standard deviation; it has been used successfully across a range of deep learning tasks, particularly in Natural Language Processing.
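For reference, the transformation applied per feature vector is the standard RMSNorm formulation, where epsilon is the usual numerical-stability term and gamma is the learned per-feature scale:

$$\mathrm{RMS}(x) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 + \epsilon}, \qquad y_i = \frac{x_i}{\mathrm{RMS}(x)}\,\gamma_i$$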
Key changes:
- Add rms_norm_ class implementing the RMS Normalization layer
- Implement rms_normalize and rms_normalize_gradient utility functions (a sketch of the underlying computation follows this list)
- Add CPU and CUDA implementations for RMS Normalization
- Include unit tests for the new layer
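
To make the computation behind the rms_normalize utilities concrete, here is a minimal, self-contained sketch of the arithmetic the forward pass performs on a single feature vector. The function name, the std::vector interface, and the default eps are illustrative assumptions for this sketch, not the tensor-based signatures added by the PR:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative sketch only: the arithmetic of an RMS normalization forward
// pass on one feature vector, not dlib's actual rms_normalize interface.
std::vector<float> rms_normalize_sketch(
    const std::vector<float>& x,
    const std::vector<float>& gamma,  // learned per-feature scale
    float eps = 1e-5f)                // assumed stabilizing constant
{
    float sum_sq = 0.0f;
    for (float v : x)
        sum_sq += v * v;
    const float rms = std::sqrt(sum_sq / x.size() + eps);

    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] = (x[i] / rms) * gamma[i];  // no mean subtraction, unlike layer norm
    return y;
}
```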
This new layer provides an alternative to the existing layer_norm_: because it omits the mean-centering step, it performs less work per element and can improve training stability in some settings.
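A minimal sketch of how the layer could be dropped into a dlib network definition, assuming the PR exposes an rms_norm<SUBNET> template alias analogous to the existing layer_norm<SUBNET> (the surrounding layers and sizes here are arbitrary placeholders):

```cpp
#include <dlib/dnn.h>

using namespace dlib;

// Hypothetical toy network: rms_norm is assumed to follow the same
// template-alias convention as layer_norm; everything else is placeholder.
using net_type = loss_multiclass_log<
                     fc<10,
                     relu<rms_norm<
                     fc<128,
                     input<matrix<float>>
                     >>>>>;
```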
Usage Example: For a more comprehensive example of how to use this new RMS Normalization layer in a Transformer-based architecture, please refer to the ERNIE project: https://github.com/Cydral/ERNIE