Brian Hirsh
Ok, I beefed up this PR (so mutations get taken out before partitioning) and added better testing. @Chillee lmk what you think of the AOTAutograd changes. I also dumped my...
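For anyone following along, here's a minimal sketch of what "mutations get taken out" means, using the current `torch.func.functionalize` / `make_fx` spellings (the PR itself goes through functorch's equivalents; this is just illustrative):

```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx
from torch.func import functionalize

def f(x):
    x.mul_(2)      # in-place mutation of a graph input
    return x + 1

# Tracing through functionalize replaces the mutation with functional ops,
# so the joint graph handed to the partitioner is mutation-free (input
# mutations reappear as an explicit copy_ at the end of the trace).
gm = make_fx(functionalize(f))(torch.randn(3))
print(gm.code)
```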
It looks like this pass isn't playing well with the partitioning code - it fails `python test/test_pythonkey.py TestAOTAutograd.test_batchnorm` when functionalization is toggled on. I think the partitioning code needs some...
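To illustrate why batchnorm is a stress test here (a sketch, not the actual failing trace): training-mode BN mutates its running stats in place, so after functionalization those updates become extra graph nodes that the partitioner has to account for:

```python
import torch
import torch.nn.functional as F
from torch.fx.experimental.proxy_tensor import make_fx
from torch.func import functionalize

def bn(x, running_mean, running_var):
    # training-mode batch_norm updates running_mean/running_var in place
    return F.batch_norm(x, running_mean, running_var, training=True)

gm = make_fx(functionalize(bn))(
    torch.randn(2, 3, 4, 4), torch.zeros(3), torch.ones(3))
print(gm.code)  # the running-stat updates now show up as explicit nodes
```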
I added an expect test in `test_compile_cache.py` (maybe I should move it somewhere else) - it also doesn't include a "before", which I'll add:

```
# Function takes in 4...
```
> Can't we just make a synthetic base? You know the size and the stride of each of the input tensors, you know they share a storage, so you can...
Talked more with Ed offline. Summarizing: Agreed that we can synthesize a base in all cases, and we don't need the `._base` attribute. How? For all program inputs that...
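Rough sketch of the idea (hypothetical helper; assumes non-empty tensors sharing one dtype and storage, and only shows shape/stride reconstruction - the real pass would build the base over the inputs' existing storage):

```python
import torch

def synthesize_base(aliased_inputs):
    # hypothetical helper: given inputs known to alias one storage,
    # manufacture a fresh base big enough to cover all of them and
    # re-derive each input as an as_strided view of it - no ._base needed
    extent = max(
        t.storage_offset() + 1
        + sum((s - 1) * st for s, st in zip(t.size(), t.stride()))
        for t in aliased_inputs)
    base = aliased_inputs[0].new_empty(extent)
    views = [base.as_strided(t.size(), t.stride(), t.storage_offset())
             for t in aliased_inputs]
    return base, views

x = torch.randn(16)
a, b = x[:8], x[4:12]   # two program inputs aliasing one storage
base, (a2, b2) = synthesize_base([a, b])
```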
I have a local fix for `jacfwd(functionalize(f))`, but it doesn't fix the problem. Is this a composite compliance issue? It looks like `matrix_exp` isn't a composite op ([native function entry](https://github.com/pytorch/pytorch/blob/35545d85dc69687c4fc6f5fbab575ca9079624a3/aten/src/ATen/native/native_functions.yaml#L11003)).
Oh hmm, it looks like it's because the derivative formula for `matrix_exp` isn't "composite compliant" (although in this case, the derivative formula isn't an op; it's just a function). Defined...
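Minimal repro for the failure mode discussed above (using the `torch.func` spellings; at the time these lived in functorch):

```python
import torch
from torch.func import jacfwd, functionalize

def f(x):
    return torch.matrix_exp(x)

x = torch.randn(3, 3)
# hits the non-composite-compliant matrix_exp derivative formula when
# forward-mode AD runs underneath functionalization
jacfwd(functionalize(f))(x)
```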
From some grepping around, it looks like there's a `torch.nn.utils.weight_norm()`, which calls `torch._weight_norm`. In C++, `torch._weight_norm` maps to `at::_weight_norm`, which eventually calls `_weight_norm_interface`: https://github.com/pytorch/pytorch/blob/d58ced3db74dfa6e86ecacc98bc8c06219a0546c/aten/src/ATen/native/WeightNorm.cpp#L93
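The chain is easy to see from Python; the direct `torch._weight_norm` call below mirrors what the weight_norm pre-forward hook does with the default `dim=0`:

```python
import torch
import torch.nn as nn

m = torch.nn.utils.weight_norm(nn.Linear(4, 4), name="weight")
# the hook recomputes m.weight from (weight_g, weight_v) on every forward;
# it bottoms out in the same op as this direct call:
w = torch._weight_norm(m.weight_v, m.weight_g, 0)
```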
hmm @soulitzer, do you know what the purpose of that line in autograd ([here](https://github.com/pytorch/pytorch/blob/23088fcfdf77632d4e6db4d35ce62735ca6622d2/torch/csrc/autograd/engine.cpp#L743)) is? Is the idea that if we downcast a tensor in the forward, then we need...
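My reading of the scenario that line guards (a sketch of the user-visible behavior, not the engine code itself): if the forward downcasts, the grad flowing back into that node is in the lower precision, and something has to restore the input's dtype:

```python
import torch

x = torch.randn(3, dtype=torch.float32, requires_grad=True)
y = x.to(torch.float16)   # forward downcasts fp32 -> fp16
y.sum().backward()        # the incoming grad for this node is fp16
print(x.grad.dtype)       # torch.float32 - the grad gets cast back up
```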
I'm pushing some more on this based on https://github.com/pytorch/pytorch/issues/85036, since adding an epilogue to AOTAutograd should unblock a few models that were previously hitting dynamo's fallback. This isn't ready for...
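For concreteness, a sketch of what I mean by an epilogue (all names here are hypothetical): the compiled, functionalized graph returns updated values for any mutated inputs, and we replay the mutations on the real inputs outside the compiled region:

```python
def run_with_epilogue(compiled_graph, args, mutated_arg_indices):
    # hypothetical wrapper: compiled_graph is functionalized, so it returns
    # (updated values for mutated inputs, *user outputs) instead of mutating
    outs = compiled_graph(*args)
    n = len(mutated_arg_indices)
    updated, user_outs = outs[:n], outs[n:]
    for i, new_val in zip(mutated_arg_indices, updated):
        args[i].copy_(new_val)   # epilogue: apply the input mutation
    return user_outs
```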