Yukio Siraichi
@jansel > Hrm, with no speedups perhaps numba isn't the right backend. I tried running the script again, but replacing [these lines](https://github.com/pytorch/pytorch/blob/a05b7b1c73247ff562a82aac0edca79bbaebc2bd/torch/_dynamo/guards.py#L758-L760) with: `symbolic_shape_expression = "(True)"`. This would allow us...
@ezyang @jansel Here's a summary of the changes: - Added C function for accessing the sizes and strides from a `THPVariable*` - Using the C ABI functions with the help...
The profiling data will take some more time...
@ezyang @jansel Here are the benchmarking results before and after this PR's changes. In order to reproduce this benchmark, run `python benchmarks/dynamo/torchbench.py -d cuda --float32 -n 50 --performance --backend aot_eager`...
@ezyang @jansel Here are the results after moving the base commit to `eae0f3f5e3a4f8c5a37f1a869284f22f3dc30e0d` (the `master` branch as of December 13). All other details are the same as before. One thing I've noticed...
@ezyang @jansel Here are the results after moving the base commit to `bc4c324807f1d613970df1c4d762b58d7fe4d8c6` (the `master` branch as of January 15). Here are a few notes: - `Exprs` corresponds to the number of generated...
As far as I can tell, most of the CI failures are due to Numba's recursion depth limit in `numba/core/controlflow.py` (line 643). Here's what I think is happening: - Guards...
@ezyang @jansel Here are the results after moving the base commit to `bc4c324807f1d613970df1c4d762b58d7fe4d8c6` (the `master` branch as of January 15). Here are a few notes: - I used the TorchInductor `CppCodeCache` as the backend...
Yes. As far as I have tested, it only happens when I use `CppCodeCache`, even when the compiled function is never actually called.
I have updated [the performance results data](https://github.com/pytorch/pytorch/pull/89707#issuecomment-1407571595), since I finished running all the benchmarks. The geometric means went up a bit: - 1.0 => 1.0388 (vs. the master branch) -...
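For readers unfamiliar with how a geometric mean of speedups is computed, here is a minimal sketch. It is not the code used by the benchmark scripts: the helper name `geometric_mean` and the sample speedup values are hypothetical and only illustrate the arithmetic, they are not taken from the results linked above.

```python
import math


def geometric_mean(speedups):
    """Return the geometric mean of a sequence of positive speedup ratios.

    Hypothetical helper for illustration only; it is not part of the
    benchmark tooling referenced in this thread.
    """
    values = list(speedups)
    if not values:
        raise ValueError("need at least one speedup value")
    if any(v <= 0 for v in values):
        raise ValueError("speedups must be strictly positive")
    # Average in log space, then exponentiate. This avoids overflow or
    # underflow that a direct product of many ratios could cause.
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    # Made-up per-model speedups relative to a baseline (1.0 means no change).
    example_speedups = [0.98, 1.02, 1.10, 1.05]
    print(f"geomean speedup: {geometric_mean(example_speedups):.4f}")
```

The geometric mean is the usual choice for aggregating ratios such as speedups, because a 2x speedup on one model and a 0.5x slowdown on another average out to exactly 1.0 rather than 1.25.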