Jason Ansel comments

Results: 199 comments of Jason Ansel

[inductor] Cooperative reductions

@pytorchbot merge

[inductor] Cooperative reductions

I am skipping the ROCm tests and filing an issue to add AMD support: #139099

[inductor] Cooperative reductions

@pytorchbot merge

torch.compile mode="max-autotune" precision appears to be lower

Can you check if this is fixed by:
```
torch.backends.cuda.matmul.allow_tf32 = False
torch.backends.cudnn.allow_tf32 = False
```
I suspect this is a result of compile being better at enabling tf32 than...
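For readers unfamiliar with why TF32 could explain a precision gap, here is a rough sketch in plain Python. It assumes TF32 keeps 10 of float32's 23 mantissa bits and approximates the conversion by truncation (real hardware may round instead). The helper name `truncate_to_tf32` is hypothetical and not part of PyTorch.

```python
import struct


def truncate_to_tf32(x: float) -> float:
    """Approximate TF32 by zeroing the 13 low mantissa bits of a float32.

    Assumption: truncation is used here for simplicity; actual GPUs may
    round to nearest when converting float32 inputs to TF32.
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Keep the sign, the 8 exponent bits, and the top 10 mantissa bits.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


small_step = 1.0 + 2.0 ** -12   # exactly representable in float32
large_step = 1.0 + 2.0 ** -10   # still representable with 10 mantissa bits

tf32_small = truncate_to_tf32(small_step)  # the 2**-12 term is lost
tf32_large = truncate_to_tf32(large_step)  # the 2**-10 term survives
```

Under this approximation, differences smaller than about 2**-10 relative to the value vanish from matmul inputs, which is one plausible way a compiled model that uses TF32 more often could show lower precision.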

torch.compile mode="max-autotune" precision appears to be lower

@shunting314 would you mind looking at this again?

torch.compile mode="max-autotune" precision appears to be lower

cc @eellison any ideas on why there would be a difference related to cudagraphs?

OpenTuner: Clarification on Required Command-Line Arguments

When you create your ArgumentParser, you should do:
```
argparse.ArgumentParser(parents=opentuner.argparsers())
```
`parents=opentuner.argparsers()` will add all the needed OpenTuner args, including default values. If you don't want the opentuner args on...
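To illustrate how `parents=` pulls in arguments and their defaults, here is a minimal sketch that uses only the standard library. The parser `tuner_args` is a hypothetical stand-in for the parsers returned by `opentuner.argparsers()`, and the flags `--example-limit` and `--source` are made up for the example.

```python
import argparse

# Hypothetical stand-in for one parser from opentuner.argparsers().
# Parent parsers must be created with add_help=False so that the child
# parser does not end up with two conflicting -h/--help options.
tuner_args = argparse.ArgumentParser(add_help=False)
tuner_args.add_argument("--example-limit", type=float, default=30.0)

# The child parser inherits every argument (and default) from its parents.
parser = argparse.ArgumentParser(parents=[tuner_args])
parser.add_argument("--source", default="kernel.c")

# With no command-line flags, the inherited default is still filled in.
defaults = parser.parse_args([])

# Inherited flags can be overridden on the command line like any other.
overridden = parser.parse_args(["--example-limit", "5"])
```

Because the defaults come from the parent parsers, a script built this way does not need any tuner flags on its command line to run.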

[inductor] Split reduction loops when there is no shared reads

Hrm weird. Seems to help benchmarks on average though.

[inductor] Split reduction loops when there is no shared reads

@pytorchbot merge

[inductor] Cleanup generate_node_schedule

@pytorchbot merge