chenyu

Results 80 issues of chenyu

I would like to update the loss numbers in the comment section too, but I could not reproduce your numbers. For repro.py, I am getting 23 eval: split train. loss...

Simplify the logic and fix an inconsistency: previously `(a*6+b*6)//16 -> (((a*6)+(b*6))//16)` but `(a*6+b*6+16)//16 -> ((((a*3)+(b*3))//8)+1)`. With this change, `(a*6+b*6)//16 -> (((a*3)+(b*3))//8)` too.
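The two rewrites above can be sanity-checked numerically. The sketch below is not the simplifier's code; it is a brute-force check in plain Python, and the function names (`original`, `simplified`, `original_plus`, `simplified_plus`, `check`) are hypothetical. It assumes `//` means floor division, as in Python.

```python
# Brute-force check of the two rewrites quoted above, using Python's
# floor division. All function names here are illustrative only.

def original(a: int, b: int) -> int:
    return (a * 6 + b * 6) // 16

def simplified(a: int, b: int) -> int:
    # Numerator and denominator both divided by their common factor 2.
    return (a * 3 + b * 3) // 8

def original_plus(a: int, b: int) -> int:
    return (a * 6 + b * 6 + 16) // 16

def simplified_plus(a: int, b: int) -> int:
    # The +16 is a whole multiple of the divisor, so it becomes +1 outside.
    return (a * 3 + b * 3) // 8 + 1

def check(lo: int = -50, hi: int = 50) -> bool:
    """Return True if both rewrites agree with the originals on [lo, hi)."""
    for a in range(lo, hi):
        for b in range(lo, hi):
            if original(a, b) != simplified(a, b):
                return False
            if original_plus(a, b) != simplified_plus(a, b):
                return False
    return True

if __name__ == "__main__":
    print(check())
```

Under floor division both identities hold for negative inputs as well. With C-style truncating division the `+1` form can differ for negative numerators, so the check would need adjusting for that case.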

Metal enables fast-math by default, which may violate the IEEE 754 standard (https://developer.apple.com/documentation/metal/mtlcompileoptions/1515914-fastmathenabled). With fast-math, `rand * 2 * pi` (used in the randn implementation) will be different from numpy about 30%...

Added negative int for the int case. TODO: test this on real GPU. Test passed on tinybox.

I can hit `RuntimeError: Error Domain=MTLCommandBufferErrorDomain Code=1 "Discarded (victim of GPU error/recovery)` on M1 Max pretty consistently with `WINO=1 PYTHONPATH=. python examples/beautiful_mnist.py`. Seems fine with `JIT=2`. Also with `WINO=1` when...

don't remember why this did not work, is it precision?

- [x] fix assign issue and beautiful_mnist. #3878
- [ ] Add assign spec as tests. #3806 #3807
- [x] fix flaky beam gpt2 test on nvidia benchmark. #3808 #3812...

beam with gpt2 and output verification has caught many issues recently. add something similar to regular ci

Adam is broken after #3763. Changing the Adam optimizer in beautiful_mnist to LAMB makes it train. Need to understand the scheduler with assign better and add more robust tests for the optimizers.