
Precompute the scaling factor in gelu_forward and gelu_backward

Open ryanmcdermott opened this issue 10 months ago • 2 comments

Thank you so much for the amazing repo!

The scaling factor in gelu_forward and gelu_backward is computed each time the functions are run: float s = sqrtf(2.0f / M_PI);

This can be precomputed ahead of time and stored as a constant.

ryanmcdermott avatar Apr 11 '24 04:04 ryanmcdermott

Is this good enough?

ayushanshul07 avatar Apr 11 '24 07:04 ayushanshul07

Doesn't it get optimised away by the compiler anyway? (I haven't actually checked, though.) Also, pointwise operations are bandwidth-limited, so adding or removing a few FLOPs shouldn't make a difference.

andylolu2 avatar Apr 11 '24 12:04 andylolu2

fixed

karpathy avatar Apr 11 '24 16:04 karpathy

@ryanmcdermott Isn't #defining a constant just textual replacement, so the actual calculation is still repeated? https://github.com/karpathy/llm.c/commit/f26cf00a615e4f643a3660071e4ef86010045db7

ayushanshul07 avatar Apr 11 '24 17:04 ayushanshul07
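The question about #define can be illustrated with a small sketch. All names here (counted_scale, SCALE_MACRO, sum_scaled_macro, sum_scaled_hoisted) are hypothetical, and the counter exists only to make each evaluation visible. The preprocessor pastes the macro text at every use site, so at the language level the expression is evaluated once per use. With a plain sqrtf of a constant argument, optimizing compilers commonly fold the call at compile time, but that is an optimization rather than a guarantee; the side effect in this sketch deliberately prevents folding.

```c
#include <math.h>

static int scale_calls = 0;  /* counts evaluations, for illustration only */

/* Hypothetical helper: computes sqrt(2/pi) and records that it ran. */
static float counted_scale(void) {
    scale_calls++;
    return sqrtf(2.0f / 3.14159265358979f);
}

/* The preprocessor pastes this text at every use site. */
#define SCALE_MACRO counted_scale()

/* Macro used inside the loop: the call is evaluated once per element. */
float sum_scaled_macro(const float *x, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; i++) acc += SCALE_MACRO * x[i];
    return acc;
}

/* Value hoisted into a local: the call is evaluated once per invocation. */
float sum_scaled_hoisted(const float *x, int n) {
    const float s = SCALE_MACRO;
    float acc = 0.0f;
    for (int i = 0; i < n; i++) acc += s * x[i];
    return acc;
}
```

For four inputs, the macro version increments the counter four times and the hoisted version once, while both return the same sum.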