Azret Botash

Results: 29 comments by Azret Botash

see https://github.com/karpathy/llm.c/pull/156

Please don't forget about MSVC/Windows. MSVC uses a pragma to turn optimization off:

```
#pragma optimize( "", off )
/* unoptimized code section */
#pragma optimize( "", on )
```

This...

> > Please don't forget about MSVC/Windows. MSVC uses a pragma to turn optimization off.
> >
> > #pragma optimize( "", off ) /* unoptimized code section */ #pragma optimize(...

Would you be able to port it to atomic_compare_exchange_weak?

```
void atomic_add(_Atomic float* dest, float val) {
    float old = *dest;
    float new_value;
    do {
        new_value = old + val;
    } while (!atomic_compare_exchange_weak(dest, &old, new_value));
}
```

Does not...

Try this:

```
void test() {
    // verify basic case.... expect about ~ 1.0
    float sum = 0.999991;
    atomicAdd(&sum, 0.000009);
    int B = 1024;
    int T = 1024;
    int C =...
```

We might need atomic accumulation in the backward pass if the number of collisions on the same bucket is significant enough to skew the gradient. And I say...

Yeah, I think so. I was looking at the same thing, trying to wrap my head around why the matmul does not report any errors when **int b = bt...

The term in bold gets canceled out.

float* x = _In + **b * T * C** + t * C;
float* y = _Out + **b * T *...