
[todo] Accumulate in double instead of float

Open karpathy opened this issue 2 months ago • 2 comments

I have not kept good hygiene on using double for accumulation wherever there are local register variables. Accumulation should be done in double, and reads/writes in float; this is a todo to fix everywhere, part by part. Doing this most likely results in much better precision/accuracy at near-zero additional compute/memory cost.

Example of an accumulator that should have been a double instead of a float:

float m = 0.0f;
for (int i = 0; i < C; i++) {
    m += x[i];
}
m = m / C;
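
For reference, a minimal sketch of the suggested change, accumulating in double and converting back to float only at the end (x and C as in the snippet above):

double m = 0.0;                 // accumulate in double
for (int i = 0; i < C; i++) {
    m += x[i];                  // x[i] stays float; each term is promoted to double
}
float mean = (float)(m / C);    // read/write back in float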

karpathy avatar Apr 16 '24 01:04 karpathy

Are you talking about these warnings?

train_gpt2.c(232,17): warning C4244: 'initializing': conversion from 'double' to 'float', possible loss of data
train_gpt2.c(304,31): warning C4244: 'argument': conversion from 'int' to 'float', possible loss of data
train_gpt2.c(304,17): warning C4244: 'initializing': conversion from 'double' to 'float', possible loss of data
train_gpt2.c(362,43): warning C4305: 'argument': truncation from 'double' to 'float'
train_gpt2.c(370,26): warning C4305: 'argument': truncation from 'double' to 'float'
train_gpt2.c(374,77): warning C4305: 'argument': truncation from 'double' to 'float'
train_gpt2.c(934,47): warning C4244: 'argument': conversion from 'int' to 'float', possible loss of data
train_gpt2.c(935,47): warning C4244: 'argument': conversion from 'int' to 'float', possible loss of data
train_gpt2.cu(728): warning C4244: 'argument': conversion from 'int' to 'float', possible loss of data
train_gpt2.cu(728): warning C4244: 'initializing': conversion from 'double' to 'float', possible loss of data
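
For context (an illustrative snippet, not taken from the repo): MSVC's C4244 flags an implicit narrowing conversion and C4305 flags truncation of a double constant; an explicit cast or a float literal expresses the intent and silences them:

double sum = 0.0;
int C = 768;                       // hypothetical values for illustration

float mean = sum / C;              // C4244: implicit 'double' to 'float' conversion
float eps  = 1e-5;                 // C4305: 'double' constant truncated to 'float'

float mean_ok = (float)(sum / C);  // explicit cast: narrowing is intentional
float eps_ok  = 1e-5f;             // float literal: no truncation warning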

ross-wheeler avatar Apr 17 '24 00:04 ross-wheeler

Kahan summation is a well-known technique for reducing numerical error in summation computations.

https://en.m.wikipedia.org/wiki/Kahan_summation_algorithm

It's rather simple to implement.
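
For illustration, a minimal sketch of Kahan (compensated) summation applied to the mean computation above, with x and C as in karpathy's earlier snippet:

float sum = 0.0f;   // running sum
float c   = 0.0f;   // running compensation for lost low-order bits
for (int i = 0; i < C; i++) {
    float y = x[i] - c;    // apply the compensation to the next term
    float t = sum + y;     // low-order bits of y may be lost in this add
    c = (t - sum) - y;     // recover what was lost (algebraically zero)
    sum = t;
}
float mean = sum / C;

Note that aggressive floating-point optimization flags (e.g. -ffast-math) allow the compiler to reassociate and optimize the compensation away, so this would need to be compiled without them.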

syoyo avatar Apr 18 '24 12:04 syoyo

Putting in here a summary of the discussion resulting from this CR: https://github.com/karpathy/llm.c/pull/221. On gaming GPUs there is a significant slowdown, on the order of 32x, when using doubles compared to floats, and using doubles instead of floats for the summation in this location led to a 2x slowdown even though the variance of the error was reduced. This will have to be decided on a case-by-case basis, weighing the tradeoff between compute-bound and memory-bound kernels.

ChrisDryden avatar Apr 22 '24 18:04 ChrisDryden