FumoTime
Results: 2 issues of FumoTime
When running 03-matrix-multiply, the performance is much lower compared to rocBLAS:
```
        M       N       K    rocBLAS    Triton
0  1024.0  1024.0  1024.0  21.770480  3.301941
1  2048.0  2048.0  2048.0  25.513268  3.196135
2...
```
### Problem Description

Trying out torch.compile via torch_migraphx, using the example code in torch_migraphx/examples/dynamo/stable_diffusion (but compiling only the unet), does not seem to give a performance increase. Passing in...