shunting314 comments

Results: 56 comments of shunting314

[inductor] do benchmark in sub processes for max autotuning

> I'd suggest doing all codegen in the parent process, then just sending a filename+sizes/strides/offsets/dtypes to the subprocess. For extern kernels, you can replace the filename with the call_name. I...
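A minimal sketch of the idea in the quoted suggestion: the parent writes the generated code to a file, then sends only the filename plus tensor metadata to a child process, which rebuilds stand-in inputs and times the call. Everything here is hypothetical and is not Inductor's actual API: the names `TensorMeta`, `BenchRequest`, `write_demo_kernel`, and `benchmark_in_subprocess`, the JSON wire format, and the pure-Python "kernel" that stands in for a compiled Triton kernel.

```python
import json
import subprocess
import sys
import tempfile
import textwrap
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class TensorMeta:
    # Plain data describing one input; no tensor object crosses the process boundary.
    sizes: list
    strides: list
    offset: int
    dtype: str


@dataclass
class BenchRequest:
    # Hypothetical request: where the generated code lives, plus input metadata.
    kernel_path: str
    inputs: list  # list of TensorMeta

    def to_json(self):
        return json.dumps(asdict(self))


# Script run by the child: load the file, rebuild inputs from metadata, time one call.
CHILD_SCRIPT = textwrap.dedent("""
    import json, runpy, sys, time
    req = json.loads(sys.stdin.read())
    kernel = runpy.run_path(req["kernel_path"])["kernel"]
    args = []
    for meta in req["inputs"]:
        numel = 1
        for s in meta["sizes"]:
            numel *= s
        args.append([0.0] * numel)  # flat list standing in for a real tensor
    start = time.perf_counter()
    kernel(*args)
    print(json.dumps({"seconds": time.perf_counter() - start}))
""")


def write_demo_kernel(directory):
    # Stand-in for codegen done in the parent process.
    path = Path(directory) / "demo_kernel.py"
    path.write_text("def kernel(a, b):\n    return [x + y for x, y in zip(a, b)]\n")
    return str(path)


def benchmark_in_subprocess(request):
    # A crash in the child surfaces as CalledProcessError instead of killing the parent.
    done = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        input=request.to_json(),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(done.stdout)["seconds"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        meta = TensorMeta(sizes=[64, 64], strides=[64, 1], offset=0, dtype="float32")
        req = BenchRequest(kernel_path=write_demo_kernel(tmp), inputs=[meta, meta])
        print(benchmark_in_subprocess(req))
```

Because the request holds only strings and integers, it serializes trivially. A real implementation would still need to allocate device tensors in the child that match the given strides and offsets.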

[inductor] do benchmark in sub processes for max autotuning

I'm thinking it would be easier to let this PR handle TritonTemplateCaller only for now since 1. this makes it simpler and we can always do the same thing for...

[inductor] do benchmark in sub processes for max autotuning

One thing I realized but have not done in this PR is that, right now, single-process autotuning leverages lambdas to represent the tuning tasks while multi-process autotuning leverages the...

[inductor] do benchmark in sub processes for max autotuning

@shunting314 has imported this pull request. If you are a Meta employee, you can view this diff [on Phabricator](https://www.internalfb.com/diff/D43996048).

[inductor] do benchmark in sub processes for max autotuning

@jansel I've updated the PR to use BenchmarkRequest for the single-process case as well. Please take another look, thanks!
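As general background on why one plain-data request type can serve both paths, here is a small sketch. `TuningRequest` is a hypothetical stand-in, not Inductor's `BenchmarkRequest`, and its `run` method only computes an element count in place of a real benchmark. It shows that Python's standard `pickle` cannot serialize a lambda, while a data-only request can be run directly in-process or rebuilt from a serialized payload.

```python
import json
import pickle
from dataclasses import asdict, dataclass


@dataclass
class TuningRequest:
    # Hypothetical data-only description of one tuning task.
    kernel_name: str
    sizes: list

    def run(self):
        # Placeholder for "benchmark this kernel": returns the element count.
        numel = 1
        for s in self.sizes:
            numel *= s
        return numel


def can_pickle(obj):
    # True if the standard pickle module can serialize obj.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# The same task expressed two ways.
task_as_lambda = lambda: TuningRequest("mm", [64, 64]).run()
task_as_data = TuningRequest("mm", [64, 64])

# The data form survives a trip through a wire format and can be rebuilt elsewhere.
payload = json.dumps(asdict(task_as_data))
rebuilt = TuningRequest(**json.loads(payload))

if __name__ == "__main__":
    print("lambda picklable:", can_pickle(task_as_lambda))
    print("rebuilt equals original:", rebuilt == task_as_data)
```

Both forms can be called directly in a single process, but only the data form can be handed to another process without extra machinery.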

[inductor] do benchmark in sub processes for max autotuning

@pytorchbot merge

Refactor instance_descriptor for new triton version

For lint issues, you can also test that locally:

```
pip install lintrunner  # if lintrunner is not installed yet
lintrunner -a
```