qazal
qazal
yay! green CI, is the diff good for review?
your diff shouldn't include any noqa for long lines, can you try to break the code down?
how much did you benchmark this against: PROFILE=0 python3 test/external/external_benchmark_schedule.py is it fast?
so I think you can break this into two PRs: 1. Rewrite all gated stores to UOps.IF 2. Add merge_gates Otherwise your PR looks good, smaller diffs help iterate on...
hmm, interesting. Why would it not pass? If it doesn't take much of your time I'd really wanna see why just rewriting to UOps.IF doesn't pass the tests.
you're having that problem because those IF srcs are wrong. My expected output is: (this is `python3 test/test_linearizer.py TestLinearizer.test_padto_max_multireduce`) [This](https://github.com/tinygrad/tinygrad/pull/5976#discussion_r1730038001) is also because of a bug in this diff.
better, can you cleanup the diff (removing merge_gates and stuff)? I'll take ownership of that process replay red and merge if all is good.
Process replay failed because this is somehow making things non deterministic. printed a full log in: https://github.com/tinygrad/tinygrad/actions/runs/10558375543/job/29247702829#step:24:61
hmm, also broke local reduce gating:  TestLinearizerFailures.test_failure_43
test_failure_43 is just wrong, see where the IF is? The tests don't just fail on AMD, just copying the AST and applied opts in the logs will show you how...