llm.c
fix: tensor checking change fabs to fabsf
I noticed that the test variables are floats, so, following the clang analyzer's suggestion, I changed fabs to fabsf in the tensor check. The tests pass with the following output:
[GPT-2]
max_seq_len: 1024
vocab_size: 50257
num_layers: 12
num_heads: 12
channels: 768
num_parameters: 124439808
[State]
batch_size: 4
seq_len: 64
num_activations: 73323776
-43.431686 -43.431671
-39.836395 -39.836403
-43.065956 -43.065933
OK (LOGITS)
LOSS OK: 5.270003 5.270007
dwte
OK -0.002320 -0.002320
OK 0.002072 0.002072
OK 0.003717 0.003717
OK 0.001307 0.001307
OK 0.000632 0.000632
TENSOR OK
dwpe
OK -0.005111 -0.005110
OK -0.000012 -0.000012
OK -0.003262 -0.003262
OK 0.009909 0.009909
OK 0.002146 0.002145
TENSOR OK
dln1w
OK -0.007523 -0.007523
OK 0.008643 0.008643
OK 0.005027 0.005029
OK -0.011094 -0.011095
OK -0.001663 -0.001664
TENSOR OK
dln1b
OK -0.038458 -0.038458
OK -0.030593 -0.030600
OK 0.010217 0.010223
OK 0.080177 0.080176
OK -0.060902 -0.060901
TENSOR OK
dqkvw
OK -0.000031 -0.000031
OK -0.000025 -0.000025
OK -0.000064 -0.000064
OK 0.000074 0.000074
OK 0.000020 0.000020
TENSOR OK
dqkvb
OK -0.000411 -0.000411
OK -0.000412 -0.000412
OK 0.000114 0.000113
OK -0.000565 -0.000565
OK 0.000571 0.000570
TENSOR OK
dattprojw
OK 0.000080 0.000080
OK -0.000005 -0.000005
OK -0.000019 -0.000019
OK 0.000004 0.000004
OK 0.000032 0.000031
TENSOR OK
dattprojb
OK 0.000471 0.000470
OK -0.009980 -0.009979
OK -0.001804 -0.001804
OK 0.037578 0.037584
OK -0.031235 -0.031239
TENSOR OK
dln2w
OK -0.018315 -0.018312
OK 0.004812 0.004813
OK 0.008089 0.008091
OK -0.001470 -0.001470
OK -0.002737 -0.002737
TENSOR OK
dln2b
OK -0.026373 -0.026368
OK -0.016702 -0.016695
OK 0.001071 0.001074
OK 0.034705 0.034711
OK -0.028581 -0.028584
TENSOR OK
dfcw
OK 0.000440 0.000440
OK -0.000000 -0.000000
OK -0.000154 -0.000154
OK -0.000165 -0.000165
OK 0.000405 0.000405
TENSOR OK
dfcb
OK 0.003293 0.003293
OK 0.002043 0.002043
OK -0.001386 -0.001386
OK 0.000386 0.000386
OK 0.001603 0.001604
TENSOR OK
dfcprojw
OK 0.000681 0.000681
OK 0.000073 0.000073
OK -0.000416 -0.000416
OK -0.000061 -0.000061
OK -0.000604 -0.000604
TENSOR OK
dfcprojb
OK 0.003584 0.003584
OK -0.007158 -0.007158
OK -0.001963 -0.001964
OK 0.001462 0.001462
OK 0.001217 0.001217
TENSOR OK
dlnfw
OK -0.000022 -0.000022
OK 0.000810 0.000811
OK 0.001161 0.001161
OK -0.002957 -0.002957
OK 0.001145 0.001145
TENSOR OK
dlnfb
OK -0.011100 -0.011101
OK 0.008009 0.008007
OK -0.004771 -0.004769
OK -0.002112 -0.002113
OK -0.005905 -0.005905
TENSOR OK
step 0: loss 5.270003 (took 1940.382000 ms)
step 1: loss 4.059711 (took 1849.022000 ms)
step 2: loss 3.375073 (took 1857.843000 ms)
step 3: loss 2.800822 (took 1820.270000 ms)
step 4: loss 2.315457 (took 1820.448000 ms)
step 5: loss 1.849130 (took 1854.129000 ms)
step 6: loss 1.394807 (took 1845.331000 ms)
step 7: loss 0.999210 (took 1856.497000 ms)
step 8: loss 0.624221 (took 1867.062000 ms)
step 9: loss 0.376609 (took 1852.179000 ms)
loss ok at step 0: 5.270003 5.270007
loss ok at step 1: 4.059711 4.059707
loss ok at step 2: 3.375073 3.375123
loss ok at step 3: 2.800822 2.800783
loss ok at step 4: 2.315457 2.315382
loss ok at step 5: 1.849130 1.849029
loss ok at step 6: 1.394807 1.394656
loss ok at step 7: 0.999210 0.999147
loss ok at step 8: 0.624221 0.624080
loss ok at step 9: 0.376609 0.376511
overall okay: 1
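
For reference, the fabs to fabsf change described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function name `check_tensor_sketch`, its `tol` parameter, and the 1e-2 tolerance used in the usage note are assumptions. The point it shows is that `fabsf` takes and returns a `float`, so the comparison stays in single precision, whereas `fabs` would promote the `float` difference to `double`.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>

/* Hypothetical helper (not the repository's code): returns 1 when every
   element of a and b differs by at most tol, otherwise 0. Prints the first
   few pairs in the same style as the log above. */
int check_tensor_sketch(const float *a, const float *b, int n,
                        float tol, const char *label) {
    assert(a != NULL && b != NULL && n >= 0);
    const int print_upto = 5;  /* only print the first few elements */
    int ok = 1;
    printf("%s\n", label);
    for (int i = 0; i < n; i++) {
        /* fabsf keeps the computation in float; fabs would convert the
           float argument to double and return a double. */
        float diff = fabsf(a[i] - b[i]);
        int elem_ok = diff <= tol;
        if (!elem_ok) { ok = 0; }
        if (i < print_upto) {
            printf("%s %f %f\n", elem_ok ? "OK" : "NOT OK", a[i], b[i]);
        }
    }
    printf("%s\n", ok ? "TENSOR OK" : "TENSOR NOT OK");
    return ok;
}
```

Calling `check_tensor_sketch(actual, expected, n, 1e-2f, "dwte")` on two `float` arrays returns 1 only if every element is within the assumed tolerance. A NaN difference fails the `<=` comparison, so it is reported as NOT OK.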