
Weight initialization

Open OneAdder opened this issue 9 months ago • 3 comments

Weight Initialization

Added functions for Xavier and Kaiming initialization. The rule of thumb here:

  • S-shaped activation (tanh, sigmoid, etc.) => Xavier
  • ReLU-shaped activation (relu, gelu, silu, etc.) => Kaiming

For networks without Layer or Batch Normalization, this simple tweak will significantly improve convergence.
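
The rule of thumb above comes down to variance scaling. As a quick illustration (in NumPy rather than Fortran, and with arbitrary example layer sizes), Xavier uniform draws from U(-limit, limit) with limit = sqrt(6/(fan_in+fan_out)), giving Var(w) = 2/(fan_in+fan_out), while Kaiming normal uses Var(w) = 2/fan_in, where the extra factor of 2 compensates for ReLU zeroing roughly half of the activations:

```python
import numpy as np

rng = np.random.default_rng(42)
fan_in, fan_out = 784, 128  # arbitrary example layer sizes

# Xavier (Glorot) uniform: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out)).
# Resulting variance is 2 / (fan_in + fan_out), which keeps the signal variance
# roughly stable through tanh/sigmoid layers in both directions.
limit = np.sqrt(6.0 / (fan_in + fan_out))
w_xavier = rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Kaiming (He) normal: N(0, 2 / fan_in). The factor of 2 accounts for ReLU
# discarding the negative half of each activation distribution.
w_kaiming = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

# Empirical variances match the targets closely for this many samples.
print(w_xavier.var())   # close to 2 / (784 + 128)
print(w_kaiming.var())  # close to 2 / 784
```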

OneAdder avatar Feb 17 '25 19:02 OneAdder

Thanks, Michael, this is definitely needed.

About 1.5 years ago I started an Initializers PR (https://github.com/modern-fortran/neural-fortran/pull/151) but forgot about it. Basically it follows a similar pattern to how activations and optimizers are done in NF, which allows complete customization if specified, and sane defaults (like the ones you have here) if unspecified.

Do you think it would work well?

milancurcic avatar Feb 17 '25 19:02 milancurcic

Added it while doing this: https://github.com/OneAdder/neural-fortran/blob/text_classification_example/example/text_classification.f90

OneAdder avatar Feb 17 '25 19:02 OneAdder

@milancurcic Yes, I think #151 will work!

OneAdder avatar Feb 17 '25 20:02 OneAdder