Jason Ansel comments

Results: 199 comments of Jason Ansel

[Inductor] Support tiling reduction dimensions

> I agree that the current 2D heuristics are pretty lazy. After the refactor PRs land, I can revisit those and come up with something more sensible, with data to...

[AOTI XPU] Enable Cpp wraper for Intel GPU.

> Hi, @jansel, the root cause of the test failures is described in issue #136940; those cases also fail on the main branch. Please rebase the PR to a...

[AOTI XPU] Enable Cpp wraper for Intel GPU.

@desertfire should be main reviewer on this one

[Inductor] New Triton Attrs Descriptor Fixups

@pytorchbot merge

Python bindings don't support bfloat16 (yet)

@steven-johnson I'm implementing a Halide backend for PyTorch/torch.compile/TorchInductor. Early work-in-progress version here: https://github.com/pytorch/pytorch/pull/126417 For this backend I am using the Halide-Python bindings to define a `hl.generator` that generates the kernel...

[Quant] Can quant not be decomposed on inductor?

Is dequantize impure? What is it mutating? IMO this op should be decomposed in inductor. You can register the decomp in the same place the op is defined.

[Quant] Can quant not be decomposed on inductor?

Impure isn't what you are looking for. Impure means the op mutates one of its inputs, so when we functionalize we need to introduce more copies (which might increase memory...
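
To illustrate the distinction drawn in the comment above, here is a minimal sketch in plain Python (not PyTorch or Inductor code). The function names `scale_inplace` and `scale_functional` are hypothetical and exist only for this example. It shows an op that mutates its input next to a functionalized form that has to make an extra copy, which is the memory cost the comment alludes to.

```python
def scale_inplace(values, factor):
    """Mutates its input list: the kind of op the comment calls impure."""
    for i in range(len(values)):
        values[i] *= factor
    return values  # same object as the argument


def scale_functional(values, factor):
    """Functionalized form: copies the input, so the original is untouched."""
    out = list(values)  # the extra copy that functionalization introduces
    for i in range(len(out)):
        out[i] *= factor
    return out


data = [1.0, 2.0, 3.0]

# The functional version leaves `data` alone and returns a new list.
result = scale_functional(data, 2.0)
data_after_functional = list(data)

# The in-place version returns the very same list object it was given.
aliased = scale_inplace(data, 2.0)
```

After the functional call `data` is unchanged; after the in-place call `aliased` and `data` are the same object, so every other reference to `data` sees the new values.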

[Quant] Can quant not be decomposed on inductor?

I don't believe we have a dont-constant-fold flag (correct me if I'm wrong @eellison ), though maybe we should.

Add SDPA patterns for T5 models

@pytorchbot rebase Looks like tests are failing

Autoschedulers fail on indirect loads

Thanks, I'll switch to using clamp. Is there a way to get halide to generate masked loads (using the hardware mask registers on GPUs)? In some cases many of the...
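
The clamp approach mentioned in the comment above can be sketched in plain Python. This is not Halide code: `clamp` and `gather_clamped` are hypothetical helpers written for this example. The idea is that an indirect load `table[index]` has a data-dependent address, and clamping the index to the valid range gives that address a known bound.

```python
def clamp(value, lo, hi):
    """Restrict value to the closed interval [lo, hi]."""
    return max(lo, min(value, hi))


def gather_clamped(table, indices):
    """Indirect load table[i] with every index clamped into bounds."""
    last = len(table) - 1
    return [table[clamp(i, 0, last)] for i in indices]


table = [10, 20, 30, 40]
indices = [2, -1, 7, 0]  # -1 and 7 fall outside the valid range 0..3

gathered = gather_clamped(table, indices)
```

Out-of-range indices are redirected to the nearest valid element rather than skipped, which is different from a masked load, where the out-of-range lanes would not be read at all.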