Add RMS Normalization Layer
This PR introduces a new RMS (Root Mean Square) Normalization layer to Dlib. RMS Normalization is a simplified variant of Layer Normalization that normalizes activations by their root mean square instead of centering them and dividing by the standard deviation; it has been used successfully across a range of deep learning tasks, particularly in Natural Language Processing.
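For reference, the transformation applied per feature vector is the standard RMSNorm formulation, where epsilon is the usual numerical-stability term and gamma is the learned per-feature scale:

$$\mathrm{RMS}(x) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 + \epsilon}, \qquad y_i = \frac{x_i}{\mathrm{RMS}(x)}\,\gamma_i$$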
Key changes:
- Add rms_norm_ class implementing the RMS Normalization layer
- Implement rms_normalize and rms_normalize_gradient utility functions (a sketch of the underlying computation follows this list)
- Add CPU and CUDA implementations for RMS Normalization
- Include unit tests for the new layer
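
To make the computation behind the rms_normalize utilities concrete, here is a minimal, self-contained sketch of the arithmetic the forward pass performs on a single feature vector. The function name, the std::vector interface, and the default eps are illustrative assumptions for this sketch, not the tensor-based signatures added by the PR:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative sketch only: the arithmetic of an RMS normalization forward
// pass on one feature vector, not dlib's actual rms_normalize interface.
std::vector<float> rms_normalize_sketch(
    const std::vector<float>& x,
    const std::vector<float>& gamma,  // learned per-feature scale
    float eps = 1e-5f)                // assumed stabilizing constant
{
    float sum_sq = 0.0f;
    for (float v : x)
        sum_sq += v * v;
    const float rms = std::sqrt(sum_sq / x.size() + eps);

    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] = (x[i] / rms) * gamma[i];  // no mean subtraction, unlike layer norm
    return y;
}
```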
This new layer provides an alternative to the existing layer_norm_: because it omits the mean-centering step, it performs less work per element and can improve training stability in some settings.
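A minimal sketch of how the layer could be dropped into a dlib network definition, assuming the PR exposes an rms_norm<SUBNET> template alias analogous to the existing layer_norm<SUBNET> (the surrounding layers and sizes here are arbitrary placeholders):

```cpp
#include <dlib/dnn.h>

using namespace dlib;

// Hypothetical toy network: rms_norm is assumed to follow the same
// template-alias convention as layer_norm; everything else is placeholder.
using net_type = loss_multiclass_log<
                     fc<10,
                     relu<rms_norm<
                     fc<128,
                     input<matrix<float>>
                     >>>>>;
```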
Usage Example: For a more comprehensive example of how to use this new RMS Normalization layer in a Transformer-based architecture, please refer to the ERNIE project: https://github.com/Cydral/ERNIE