Brian Hirsh comments

Results: 100 comments of Brian Hirsh

turn functionalization on in aot_autograd inference

Updated. I addressed + commented on all of the targeted PR feedback. The majority of the material changes were: - Moved create_forward_or_joint_functionalized and its layered functions all to be global...

turn functionalization on in aot_autograd inference

After talking to Natalia, the failures that I sent to her all seemed to be either flaky (existing before this PR too), or mild accuracy differences where we're ok upping...

turn functionalization on in aot_autograd inference

`jx_nest_base` also failed on the last CI run when run with dynamic shapes, but passes when I run it locally. The error also seems pretty unrelated to this PR: ```...

turn functionalization on in aot_autograd inference

@ngimel it looks like `python test/inductor/test_torchinductor.py CudaTests.test_argmax_argmin2_cuda` is flaky too - confirmed locally that it errors for me on the commit below my PR.

turn functionalization on in aot_autograd inference

Interesting new failure that didn't show up before: There's an input mutation test in inductor that mutates the input twice, and it seems to fail on the CPU backend but...
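
For readers unfamiliar with what "mutating the input twice" means under functionalization, here is a toy, framework-free sketch. It is not the inductor test from the comment and it does not use PyTorch: plain Python lists stand in for tensors, and every function name (`mutate_twice`, `mutate_twice_functional`, `run_functionalized`) is hypothetical. The idea it illustrates, under those assumptions, is that the two in-place updates are rewritten as out-of-place ops and a single copy-back into the input happens at the end.

```python
# Toy illustration only: plain Python lists stand in for tensors.
# All names here are hypothetical and not part of PyTorch.


def mutate_twice(x):
    """Eager-style function: mutates its input in place, twice."""
    for i in range(len(x)):
        x[i] += 1  # first in-place mutation
    for i in range(len(x)):
        x[i] *= 2  # second in-place mutation
    return x


def mutate_twice_functional(x):
    """Functionalized body: no in-place ops, returns the final value."""
    x1 = [v + 1 for v in x]   # out-of-place version of the first mutation
    x2 = [v * 2 for v in x1]  # out-of-place version of the second mutation
    return x2


def run_functionalized(x):
    """Run the pure body, then copy the result back into the input once."""
    result = mutate_twice_functional(x)
    x[:] = result  # single copy-back makes the mutation visible to the caller
    return x


a = [1, 2, 3]
b = [1, 2, 3]
mutate_twice(a)        # in-place path
run_functionalized(b)  # functionalized path; both inputs end up with the same contents
```

The point of the comparison is that the caller observes the same final input either way; a backend that handles the copy-back incorrectly for repeated mutations would break that equivalence.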

turn functionalization on in aot_autograd inference

I have a repro that doesn't include AOTAutograd - it only includes inductor's `compile_fx_inner`. It also looks like my repro passes on a commit from 2 days ago, `76ed1a81d14f18d6078f11d525aafe5de694cadb`, but...

turn functionalization on in aot_autograd inference

It looks like https://github.com/pytorch/pytorch/pull/94110 is the culprit - the above repro passes on the preceding commit on master (`7ce785b50b15e50b5aff9f62451a0c1f01b03f03`), but fails on that commit (`ceab30775b80306f10a08b2f1e3a4d11b1835a75`). Filing an issue. cc @ngimel...

turn functionalization on in aot_autograd inference

The other failing set of tests was `python test/inductor/test_torchinductor_opinfo.py TestInductorOpInfoCPU.test_comprehensive_resize__cpu_float16` - I just added an explicit fallback for `resize` and `resize_as` to `lowerings.py` to get around it (I think there...

turn functionalization on in aot_autograd inference

@pytorchbot merge

[prototype, do not land] POC of torch.compiling wrapper tensor subclasses (torchquant)

@wconstab it sounds like that subclass plays a similar role of translating "ops on a subclass" into "ops on dense tensors, with distributed collectives added at the right points into...