CUDA kernel and scheduled IR print functions from FusionDefinition
🚀 The feature, motivation and pitch
Add CUDA kernel and scheduled IR print functions to the FusionDefinition in Python. Perhaps as an API like this:
fd.cuda_kernel(inputs)
fd.last_executed_cuda_kernel()
fd.scheduled_ir(inputs)
fd.last_scheduled_ir()
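To make the intended semantics of the four calls concrete, here is a minimal sketch using a stand-in class. Everything in it is an assumption for illustration: `MockFusionDefinition`, its `execute` method, the shape-based cache key, and the placeholder kernel text are hypothetical and are not the real nvFuser implementation. The idea it shows is that the `inputs` variants look up the kernel for a given input signature, while the `last_executed` variant returns whatever ran most recently, and both return strings.

```python
from typing import Dict, Optional, Tuple

ShapeKey = Tuple[Tuple[int, ...], ...]


class MockFusionDefinition:
    """Stand-in for illustration only; NOT the real FusionDefinition."""

    def __init__(self) -> None:
        # Hypothetical cache: input signature -> generated kernel text.
        self._kernels: Dict[ShapeKey, str] = {}
        self._last_kernel: Optional[str] = None

    @staticmethod
    def _signature(inputs) -> ShapeKey:
        # Assumption: inputs are described by their shapes only.
        return tuple(tuple(shape) for shape in inputs)

    def execute(self, inputs) -> None:
        # Pretend to compile (once per signature) and run a kernel.
        key = self._signature(inputs)
        if key not in self._kernels:
            self._kernels[key] = f"// placeholder kernel for shapes {key}"
        self._last_kernel = self._kernels[key]

    def cuda_kernel(self, inputs) -> str:
        # Return the kernel text for these inputs as a Python string.
        key = self._signature(inputs)
        if key not in self._kernels:
            raise KeyError(f"no kernel compiled for shapes {key}")
        return self._kernels[key]

    def last_executed_cuda_kernel(self) -> str:
        # Return the text of the most recently executed kernel.
        if self._last_kernel is None:
            raise RuntimeError("no kernel has been executed yet")
        return self._last_kernel
```

`scheduled_ir(inputs)` and `last_scheduled_ir()` would presumably mirror this pair, returning the scheduled IR text instead of the CUDA source.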
cc @mruberry
Ivan Reports:
It should also print to the Python console output, not the terminal. In Colab, fusion.print() is visible only in the logs, not in the usual cell output.
Make sure this is fixed.
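A likely explanation for the Colab behavior, stated here as an assumption rather than something confirmed in this issue, is that the print happens on the native side and writes straight to the process's stdout file descriptor, bypassing Python's `sys.stdout`, which is what notebooks capture. The sketch below reproduces that difference with the standard library only; `print_from_native` merely emulates a native write with `os.write` and is not nvFuser code.

```python
import contextlib
import io
import os


def print_from_python() -> None:
    # Goes through sys.stdout, so Python-level capture sees it.
    print("from python print")


def print_from_native() -> None:
    # Emulates a native (C/C++) write: bytes go directly to file
    # descriptor 1 and never pass through sys.stdout.
    os.write(1, b"from native write\n")


def capture_python_stdout(fn) -> str:
    # Capture whatever fn writes to sys.stdout, similar in spirit to
    # how a notebook collects cell output.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        fn()
    return buffer.getvalue()
```

If this is indeed the cause, having the new functions return a Python `str` (and letting the caller `print` it) would make the output appear in the cell regardless of where the text was generated.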
It was suggested that we think about iterating over the available kernels for a particular fusion. That is currently not possible, given that the FusionExecutorCache is opaque to the Python frontend, but we could think about exposing iteration.
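As a rough picture of what exposed iteration could look like from Python, here is a hypothetical sketch. `KernelInfo`, `MockExecutorCache`, and their fields are invented for illustration and do not correspond to any existing FusionExecutorCache binding; the point is only that implementing `__iter__` and `__len__` would let users loop over the kernels of a fusion.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class KernelInfo:
    """Hypothetical record describing one compiled kernel."""
    segment_id: int
    cuda_code: str
    scheduled_ir: str


class MockExecutorCache:
    """Stand-in for illustration only; NOT the real FusionExecutorCache."""

    def __init__(self, kernels: List[KernelInfo]) -> None:
        self._kernels = list(kernels)

    def __len__(self) -> int:
        return len(self._kernels)

    def __iter__(self) -> Iterator[KernelInfo]:
        # Exposing iteration lets the frontend walk every kernel.
        return iter(self._kernels)
```

With something of this shape, `for kernel in cache: print(kernel.cuda_code)` would print each kernel through Python's own stdout.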
Christian has mentioned that printing kernels and scheduled IRs is likely gated on work by @mmigdal-nv to expose printing the kernels from segments, instead of dropping in an environment variable.