
torchortho

Open K-H-Ismail opened this issue 10 months ago • 1 comments

KAT uses the variance-preserving initialization, as formulated in Kaiming initialization, for its learnable rational activations. This requires computing the second-order moment of a rational function, which has a complicated closed form. We show that this second-order moment can be computed easily when the activation is expanded in orthogonal functions. As examples, we used orthogonal polynomials (Hermite) and trigonometric functions (Fourier), and showed that they achieve better results in image classification on ImageNet with ConvNeXt and in next-token prediction on OpenWebText with GPT-2.
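To illustrate why orthogonality makes the second-order moment easy, here is a minimal NumPy sketch. It is not the torchortho API. It assumes a standard normal input and an activation written in probabilists' Hermite polynomials rescaled to be orthonormal (He_k divided by sqrt(k!)). Under those assumptions the second moment reduces to the sum of squared coefficients, which the sketch checks against Gauss-Hermite quadrature. The function names and the coefficient values are hypothetical.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He


def second_moment_closed_form(coeffs):
    """E[f(x)^2] for f = sum_k a_k * He_k / sqrt(k!) with x ~ N(0, 1).

    Orthonormality under the Gaussian measure makes all cross terms vanish,
    so only the squared coefficients remain.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    return float(np.sum(coeffs ** 2))


def second_moment_quadrature(coeffs, n_nodes=32):
    """Same quantity, computed numerically with Gauss-Hermite quadrature."""
    coeffs = np.asarray(coeffs, dtype=float)
    # Convert orthonormal-basis coefficients to plain He_k coefficients.
    norms = np.array([math.sqrt(math.factorial(k)) for k in range(len(coeffs))])
    # Nodes and weights for the weight function exp(-x^2 / 2).
    x, w = He.hermegauss(n_nodes)
    f = He.hermeval(x, coeffs / norms)
    # Dividing by sqrt(2*pi) turns the weighted sum into a Gaussian expectation.
    return float(np.sum(w * f ** 2) / math.sqrt(2.0 * math.pi))


# Hypothetical coefficients chosen so that the squared coefficients sum to 1,
# i.e. a unit second moment as a variance-preserving initialization would want.
coeffs = np.array([0.0, 0.5, 0.5, 0.5, 0.5])
closed = second_moment_closed_form(coeffs)
numeric = second_moment_quadrature(coeffs)
```

In this setting, matching a target second moment only requires rescaling the coefficient vector, with no integral of a rational function to evaluate.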

📄 Paper: Learnable Polynomial, Trigonometric, and Tropical Activations

💻 Code: torchortho on GitHub

K-H-Ismail, Feb 04 '25 10:02