Antonio Stano

30 comments by Antonio Stano

@dagelf i knew we could still go further with the cpu, thanks! looking into it

> So maybe this is ok to merge... 1 it looks a little funny is there no way to combine the double nested if into one condition? 2 i think...

did you check if you have OpenMP installed?

```bash
# on macOS
$ brew install libomp
# on ubuntu
$ sudo apt-get install libomp-dev
```

after installing build it again...
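A quick way to confirm that a build actually picked up OpenMP is to check the `_OPENMP` macro, which compilers define only when OpenMP is enabled (for example with `-fopenmp`). This is a minimal sketch; the function names `openmp_version` and `openmp_enabled` are made up for illustration and are not part of the project.

```c
/* Returns the OpenMP version date (an integer of the form yyyymm) reported
   by the compiler, or 0 when this file was compiled without OpenMP. */
int openmp_version(void) {
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

/* Returns 1 when OpenMP was enabled at compile time, 0 otherwise. */
int openmp_enabled(void) {
    return openmp_version() != 0;
}
```

Calling `openmp_enabled()` from `main` and printing the result should show 0 for a plain build and 1 once the library is installed and the OpenMP flag is passed to the compiler.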

we get this error when `mean_loss` is -1.0f, which is the initial value; this attribute is only updated in the `gpt2_forward` function when the target param is...
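The sentinel pattern described here can be sketched roughly as follows. This is a toy illustration, not the project's code: `ToyModel`, `toy_init`, `toy_forward` and `toy_has_loss` are hypothetical names, and the sketch assumes the loss is only computed when a non-NULL targets pointer is passed.

```c
#include <stddef.h>

/* Hypothetical stand-in for the model struct; only the field under discussion. */
typedef struct {
    float mean_loss; /* -1.0f means "no loss has been computed" */
} ToyModel;

/* Set the sentinel so callers can tell that no loss exists yet. */
void toy_init(ToyModel *m) {
    m->mean_loss = -1.0f;
}

/* Toy forward pass: averages per-position losses only when targets are given.
   Without targets (assumption: signalled by NULL) the sentinel is kept. */
void toy_forward(ToyModel *m, const float *losses, const int *targets, size_t n) {
    if (targets == NULL || n == 0) {
        m->mean_loss = -1.0f;
        return;
    }
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += losses[i];
    }
    m->mean_loss = sum / (float)n;
}

/* Returns 1 if a real loss is available, 0 if the sentinel is still in place. */
int toy_has_loss(const ToyModel *m) {
    return m->mean_loss != -1.0f;
}
```

Any step that needs the loss (a backward pass, for instance) can call `toy_has_loss` first and report an error if the forward pass ran without targets.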

-Ofast already enables all the optimizations from -O3 plus other aggressive ones, so maybe it's better to use either -O3 or -Ofast, not both?

@karpathy or maybe the optim could be custom made llvm passes targeting train_gpt2 and not automatically generated from the -O flags

The -Ofast flag enables the `-ffast-math` flag. With `-ffast-math` the compiler assumes that non-finite values (NaN, infinity) never occur, and it can also ignore some mathematical rules that would otherwise forbid certain optimizations. It's...

Ok I guess I found the problem, now -Ofast and -O3 work together when both are enabled; the issue was caused by -ffast-math being enabled and messing with floating point arithmetic. This will...

@Infatoshi Yes, if `CFLAGS = -O3 -Ofast -fno-fast-math -Wno-unused-result`, using only `CFLAGS = -O3 -Wno-unused-result` without -Ofast should work

Ok so from what i saw, and to recap, on ubuntu machines (i am on 22.04):

```Makefile
CFLAGS = -O3 -Ofast -fno-fast-math -Wno-unused-result
```

will work. If this doesn't work...