Brian Hirsh
I think I've addressed all of the feedback except https://github.com/pytorch/pytorch/pull/92857/files#r1084714158. The important bit is that I renamed `create_joint_forward_backward_functionalized` to `create_forward_or_joint_functionalized`, which tries to abstract away the differences between dispatch_base vs...
From some of the failing tests: I thought that the inference case wouldn't require special handling for outputs that alias inputs, but that isn't quite right. Given...
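To illustrate why alias outputs need special handling, here is a minimal plain-Python sketch (not the actual AOTAutograd code, and using a hypothetical `FakeTensor` stand-in): a compiled graph typically returns fresh outputs, so if eager mode returned an *alias* of an input, the runtime wrapper has to regenerate the alias from the input rather than return the graph's output, or caller-side mutations would stop propagating.

```python
# Conceptual sketch: preserving output-aliases-input semantics around a
# compiled graph. FakeTensor is a hypothetical stand-in for a tensor.

class FakeTensor:
    def __init__(self, data):
        self.data = data  # shared list stands in for shared storage

def compiled_graph(x):
    # Compiled graphs tend to produce fresh outputs, breaking aliasing.
    return FakeTensor(list(x.data))

def run_compiled(x, output_is_alias_of_input=True):
    out = compiled_graph(x)
    # Metadata recorded at trace time tells the wrapper that the output
    # aliased the input, so it returns the input itself instead.
    return x if output_is_alias_of_input else out

x = FakeTensor([1, 2, 3])
y = run_compiled(x)
assert y is x  # aliasing preserved, matching eager semantics
```

The real wrapper regenerates views (not just identity aliases), but the principle is the same: alias relationships are reconstructed outside the graph.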
> To bring optimizers to their todays state we'd need https://github.com/pytorch/pytorch/issues/91310 implemented too?

Yep, agreed with all of the above. I actually tried kicking off a benchmark yesterday, but realized I...
After talking to @ngimel, here's the tentative plan (lmk if anyone has thoughts): (1) benchmark to confirm that we see noticeable regressions on some models due to the optimizer step getting slower (2)...
This is ready for another round. **Major changes** are:
- I added a flag to aot_module/function, `keep_inference_input_mutations` (off by default, on for inductor)
- That flag instructs the aot_dispatch_base codepath...
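A minimal sketch of the tradeoff this flag controls (plain Python, not the actual AOTAutograd implementation): under functionalization, an input mutation is removed from the graph, which then returns the updated value, and a runtime epilogue copies it back into the input. Keeping the mutation in the graph for inference avoids that extra copy while preserving the same observable semantics.

```python
# Toy sketch of functionalized vs. in-graph input mutation.
# Lists stand in for tensor storage; "x[i] += 1" stands in for an
# in-place op like x.add_(1).

def traced_graph_functionalized(x):
    # Purely functional graph: no mutation, returns the updated value.
    return [v + 1 for v in x]

def run_functionalized(x):
    updated = traced_graph_functionalized(x)
    x[:] = updated  # epilogue: copy the update back into the input
    return x

def traced_graph_with_mutation(x):
    # Mutation kept inside the graph: update the input storage in place.
    for i in range(len(x)):
        x[i] += 1
    return x

buf = [1, 2, 3]
run_functionalized(buf)
assert buf == [2, 3, 4]

buf2 = [1, 2, 3]
traced_graph_with_mutation(buf2)
assert buf2 == [2, 3, 4]  # same result, no epilogue copy needed
```

The epilogue copy is cheap for a single buffer but adds up across the many parameter updates in an optimizer step, which is why keeping mutations in the inference graph matters there.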
I fixed a few unit tests, although I think... `python test/inductor/test_torchinductor_opinfo.py TestInductorOpInfoCPU.test_comprehensive_nn_functional_cosine_embedding_loss_cpu_bool` seems to be failing nondeterministically for me locally. The aot_eager variant appears to be fine. I'll see what...
Ok, that test is deterministically failing. I also confirmed that it only fails on inductor, and not aot_eager, and it fails for both cpu and cuda. I have a separate...
There are still more CI failures coming in, so I'll start looking through them.
Marked a few failures as low-pri for now. Currently looking at the tts_angular failure.
I'll focus on the existing CI failures first, then pivot to the (very valid) PR feedback. On the errors: (1) the most concerning was the `sebotnet33ts_256` accuracy failure. I have that...