Weight initialization
Added functions for Xavier and Kaiming initialization. The rule of thumb here:

- S-shaped activation (`tanh`, `sigmoid`, etc.) => Xavier
- ReLU-shaped activation (`relu`, `gelu`, `silu`, etc.) => Kaiming

For networks without Layer or Batch Normalization, this simple tweak significantly improves convergence.
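For reference, here is a minimal, library-independent sketch of the two schemes (normal variants); the module name, the `fan_in`/`fan_out` arguments, and the `random_normal` helper are illustrative only, not the neural-fortran API:

```fortran
module init_sketch
  implicit none
contains

  subroutine xavier_normal(w, fan_in, fan_out)
    ! Glorot/Xavier: variance 2 / (fan_in + fan_out), suited to tanh/sigmoid-like activations
    real, intent(out) :: w(:,:)
    integer, intent(in) :: fan_in, fan_out
    call random_normal(w, sqrt(2. / real(fan_in + fan_out)))
  end subroutine xavier_normal

  subroutine kaiming_normal(w, fan_in)
    ! He/Kaiming: variance 2 / fan_in, suited to relu/gelu/silu-like activations
    real, intent(out) :: w(:,:)
    integer, intent(in) :: fan_in
    call random_normal(w, sqrt(2. / real(fan_in)))
  end subroutine kaiming_normal

  subroutine random_normal(x, std)
    ! Samples N(0, std**2) via the Box-Muller transform on the intrinsic uniform RNG
    real, intent(out) :: x(:,:)
    real, intent(in) :: std
    real, allocatable :: u1(:,:), u2(:,:)
    allocate(u1, mold=x)
    allocate(u2, mold=x)
    call random_number(u1)
    call random_number(u2)
    x = std * sqrt(-2. * log(1. - u1)) * cos(8. * atan(1.) * u2)
  end subroutine random_normal

end module init_sketch
```

A dense layer with `n` inputs and `m` outputs would then call `xavier_normal(w, n, m)` or `kaiming_normal(w, n)` on its weight matrix before training.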
Thanks, Michael, this is definitely needed.
About 1.5 years ago I started an Initializers PR (https://github.com/modern-fortran/neural-fortran/pull/151) but then forgot about it. It follows a similar pattern to how activations and optimizers are done in NF: complete customization if specified, and sane defaults (like the ones you have here) if unspecified.
Do you think it would work well?
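Roughly, the usage I have in mind would read something like this (a hypothetical sketch of the pattern only; the `initializer` argument and the `kaiming()` constructor below are placeholders, not the actual interface in #151):

```fortran
! Hypothetical sketch, not the actual #151 API: pick an initializer
! explicitly where you care, fall back to a sane default otherwise.
program initializer_usage_sketch
  use nf, only: dense, input, network
  type(network) :: net

  net = network([ &
    input(784), &
    dense(128, initializer=kaiming()), &  ! explicit choice (placeholder constructor)
    dense(10) &                           ! unspecified: sane default applies
  ])
end program initializer_usage_sketch
```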
I added these while working on this example: https://github.com/OneAdder/neural-fortran/blob/text_classification_example/example/text_classification.f90
@milancurcic Yes, I think #151 will work!