qazal

Results: 187 comments by qazal

looks good, can you resolve the conflicts?

thanks - adding this to the scheduler roadmap!

I need to think about the scheduler change a bit more, but in general we don't wanna do merge schedules, if there is grouping to be done it should be...

I think all of your fusion targets are children of `E_2048` https://tiny-tools-client.vercel.app/?id=3ef8c4a72b0c4999acca0dff9288b2fa could traversing its local graph work?

> forward -> BN forward -> stuff -> fusion targets If you fuse those targets then doesn't the cache fill up with a bunch of the "stuff" bufs? We wanna...

Moved everything to the scheduler, TestSchedule.test_fold_conv_relu_backward is green. Tomorrow:
- the reshapes and permutes don't generalize to the arange fusion. Is there some sort of generic ShapeTracker witchcraft that'd handle...

Makes sense. I think the existing tests (especially external_test_opt!) were great in helping reason about the fusion approach. This PR includes tests covering all the possible cases for the new...

This will be merged after fusing all self-contained subgraphs (we'll get it for free). For now, there are no major speedup benefits from fusing realized reduces.

Status: This reduces resnet training time to 109h54m - I'm working on simplifying the diff. I think this is a good opportunity to generalize scheduler fusion in reduce_for_op.

will reopen once I get to pick this back up. For now, we have r3 that does the same thing as this one would do in fewer kernels. But some...