Antonio Stano
Ok, I will try this change in train_gpt2.c and then build it with your original Makefile, since I also had the issue, like a lot of other Ubuntu machines. I updated...
so @karpathy the issue seems to occur only on Ubuntu systems (or Linux in general); on macOS it works fine. On Ubuntu it's solved with the previous suggestions, maybe a comment...
@bexcite yes, apparently the flag -ffast-math messes up the floating point arithmetic, but does it work on your machine with -Ofast enabled and -fno-fast-math to disable -ffast-math? Also -ffast-math does...
yes, this is what I mentioned here for ubuntu machines: > Ok so from what i saw, and to recap, on ubuntu machines (i am on 22.04): > > ```makefile...
Yes, removing -O3 or -Ofast will make the code slower. I don't know how much -ffast-math influences the speed, and also -Ofast is just a wrapper around -O3 with more...
[#21] what i have is: - `CFLAGS = -Ofast -fno-fast-math -Wno-unused-result` : ```bash step 0: train loss 5.356082 (took 19942.187432 ms) step 1: train loss 4.300639 (took 20333.230115 ms) step...
_guys wake up, new optimization just dropped_, check [#21]: we found a better alternative to -fno-fast-math for an almost 2x performance improvement, try to use: `CFLAGS = -Ofast -fno-finite-math-only -Wno-unused-result`
looking into it!
little update: the problem seems to be with -Ofast. Using -O3 and -Ofast together throws the error #19, and using only -Ofast also throws the error. With only -O3 it works,...
I found out that -Ofast enables a lot of aggressive optimizations that can alter the behaviour of the program; in particular it enables the flag -ffast-math, which can mess the...
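If you want to double check which of these math modes your build actually ends up with, here is a minimal sketch (not part of train_gpt2.c). It relies on the `__FINITE_MATH_ONLY__` and `__FAST_MATH__` macros that GCC and Clang predefine; the function names and the `MATH_FLAGS_DEMO_MAIN` guard are made up for this example.

```c
#include <stdio.h>

/* Returns 1 if the compiler was allowed to assume that no NaN/Inf values
   occur (-ffinite-math-only, which -ffast-math and therefore -Ofast turn on),
   0 otherwise. __FINITE_MATH_ONLY__ is predefined by GCC and Clang. */
int finite_math_only_enabled(void) {
#if defined(__FINITE_MATH_ONLY__)
    return __FINITE_MATH_ONLY__ ? 1 : 0;
#else
    return 0; /* macro not provided by this compiler: assume the flag is off */
#endif
}

/* Returns 1 if the compiler defines __FAST_MATH__ (fast-math mode in effect),
   0 otherwise. */
int fast_math_enabled(void) {
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}

/* Prints a one-line summary of the two settings. */
void print_math_flags(void) {
    printf("finite-math-only: %d, fast-math: %d\n",
           finite_math_only_enabled(), fast_math_enabled());
}

/* Hypothetical guard so the helpers can also be dropped into another file;
   build with -DMATH_FLAGS_DEMO_MAIN to get a standalone program. */
#ifdef MATH_FLAGS_DEMO_MAIN
int main(void) {
    print_math_flags();
    return 0;
}
#endif
```

Assuming you save it as `check_flags.c` (file name is just an example), you can build it once with `-Ofast -DMATH_FLAGS_DEMO_MAIN` and once with `-Ofast -fno-finite-math-only -DMATH_FLAGS_DEMO_MAIN` and compare what it prints on your machine.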