chenyu
see https://github.com/tinygrad/tinygrad/issues/1801
```
from tinygrad import Tensor, GlobalCounters

a = Tensor.rand(3, 4).realize()
b = Tensor.rand(3, 4).realize()
w = Tensor([0.3]).realize()

GlobalCounters.reset()
(a+(b-a)*w).realize()
print(f"(a+(b-a)*w) {GlobalCounters.kernel_count=}")

GlobalCounters.reset()
(a*(1-w)+b*w).realize()
print(f"(a*(1-w)+b*w) {GlobalCounters.kernel_count=}")
```
```
(a+(b-a)*w) GlobalCounters.kernel_count=1
(a*(1-w)+b*w)...
```
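The two expressions compared in this snippet, `a+(b-a)*w` and `a*(1-w)+b*w`, are two algebraic forms of the same linear interpolation, so any difference in kernel count comes from how the expression is scheduled rather than from the math. A minimal pure-Python sketch of that equivalence follows; it does not use tinygrad or reproduce its kernel counting, and the helper names `lerp_delta` and `lerp_weighted` are made up for illustration.

```python
import math
import random

def lerp_delta(a, b, w):
    """Interpolate element-wise as a + (b - a) * w."""
    return [x + (y - x) * w for x, y in zip(a, b)]

def lerp_weighted(a, b, w):
    """Interpolate element-wise as a * (1 - w) + b * w."""
    return [x * (1 - w) + y * w for x, y in zip(a, b)]

random.seed(0)
# 12 values stand in for a flattened 3x4 tensor (an assumption for this sketch).
a = [random.random() for _ in range(12)]
b = [random.random() for _ in range(12)]
w = 0.3

delta = lerp_delta(a, b, w)
weighted = lerp_weighted(a, b, w)

# The two forms agree up to floating-point rounding, not necessarily bit for bit.
all_close = all(
    math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-12)
    for p, q in zip(delta, weighted)
)
print(all_close)
```

The comparison uses a tolerance because the two forms round differently in floating point.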
ops not reflecting the underlying compute if the underlying instructions are fused
```
from tinygrad import Tensor, GlobalCounters

a = Tensor.rand(3, 4).realize()
b = Tensor.rand(3, 4).realize()

GlobalCounters.reset()
(a+b).realize()
print(f"a+b {GlobalCounters.global_ops=}")

GlobalCounters.reset()
...
```
fix max not defined for dtypes like uint8.
removed patterns `x * (-1) -> -x` and `x != True`.
investigating why multireduce tests in `test_linearizer` are failing in _assert_valid_uop...
`PYTHONPATH="." GPU=1 IMAGE=0 python -m pytest test/test_ops.py -k test_gemm_fp16` passed

`PYTHONPATH="." GPU=1 IMAGE=2 python -m pytest test/test_ops.py -k test_gemm_fp16` failed with `Exception: forward pass failed shape (64, 64): dtype mismatch:...