default for `resolve` should be False; otherwise the unresolved case might skip the shrink
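a minimal sketch of why the default matters, using a hypothetical stand-in for tinygrad's `resolve` helper (not the real implementation): an unresolved symbolic condition falls back to `default`, so a guard of the form "skip the shrink when it provably fits" must default to False to stay conservative
```
# hypothetical stand-in for tinygrad's resolve: turn a possibly-symbolic
# condition into a concrete bool, falling back to `default` when undecided
def resolve(cond, default: bool = False):
    return cond if isinstance(cond, bool) else default

# guard of the form "skip the shrink when it provably fits"
def maybe_shrink(data, fits):
    if resolve(fits):          # unresolved -> False -> the shrink still runs
        return data
    return data[:2]            # stand-in for the real shrink

print(maybe_shrink([1, 2, 3], fits=True))    # [1, 2, 3]: proven, skip the shrink
print(maybe_shrink([1, 2, 3], fits="s<4"))   # [1, 2]: unresolved, shrink anyway
```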
```
from tinygrad import Tensor, Device
a = Tensor([0, 1, 2, 3], device=f"{Device.DEFAULT}")
b = Tensor([0, 1, 2, 3], device=f"{Device.DEFAULT}:1")
print(f"{(a+b).tolist()}")
```
```
Traceback (most recent call last):
  File "/Users/chenyu/code/tinygrad/test.py",...
```
this is simpler because the shapetracker is shared
```
from tinygrad import Tensor
Tensor([1, 2, 3]).pad(((1, 1),), value=4).realize()
```
with NOOPT=1 this produces
```
int gidx0 = gid.x; /* 5 */
bool...
```
```
from tinygrad import Tensor, TinyJit

@TinyJit
def f(a, b): return a + b

a = Tensor([1, 2, 3])
b = Tensor([4, 5, 6])
for _ in range(5): f(a, b)
...
```
i am generating one to unblock the ALU op migration; we need a better way to manage this so that it does not block contributions
from https://github.com/tinygrad/tinygrad/issues/7421#issuecomment-2453046963 `CLANG=1 python -c "from tinygrad import Tensor; print(Tensor([30000], dtype='half').mul(200).float().item())"` returns 6000000.0, which is larger than the max finite fp16 value (65504).
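for reference, a numpy sketch of the expected fp16 semantics (not tinygrad's current output): when the multiply actually stays in half precision, the product overflows past 65504 to inf
```
import numpy as np

# multiplying two fp16 scalars keeps the result in fp16:
# 30000 * 200 = 6000000 overflows the fp16 range (max 65504) -> inf
x = np.float16(30000) * np.float16(200)
print(x)  # inf (numpy also emits an overflow RuntimeWarning)
```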
only do this if at least one branch is const, so the total ALU count won't increase
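a minimal sketch of the idea with hypothetical helper names (not tinygrad's actual rewrite rules): pushing an ADD through a WHERE duplicates the ADD into both branches, so the rewrite is only free when at least one branch is a constant that folds away
```
from dataclasses import dataclass

@dataclass(frozen=True)
class Const: val: float

def add(x, y):
    # constant folding: Const + Const collapses to a single Const
    if isinstance(x, Const) and isinstance(y, Const): return Const(x.val + y.val)
    return ("ADD", x, y)

def push_add_through_where(cond, t, f, c: Const):
    # WHERE(cond, t, f) + c  ->  WHERE(cond, t + c, f + c)
    # only when t or f is const: the const side folds away, so the rewrite
    # trades one ADD for one ADD and the total ALU count stays flat
    if isinstance(t, Const) or isinstance(f, Const):
        return ("WHERE", cond, add(t, c), add(f, c))
    return None  # skip: duplicating the ADD into both branches adds an ALU op
```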