tvm
[CUBLAS][FP8] Enable fusing astype operation for matmul multiply pattern
This PR fuses the astype operation into the matmul pattern offloaded to cuBLAS, so that a matmul followed by a cast is matched as a single pattern. The change is needed to improve FP8 performance.
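To illustrate what "fusing astype into matmul" means at the graph level, here is a minimal, self-contained sketch. It does not use TVM's API: the node classes (`Var`, `MatMul`, `AsType`, `FusedMatMulCast`), the function `fuse_matmul_astype`, and the dtype strings are all hypothetical names chosen for this example, and the real pattern in the PR may differ.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical toy IR, not TVM's: just enough nodes to show the rewrite.
@dataclass(frozen=True)
class Var:
    name: str
    dtype: str


@dataclass(frozen=True)
class MatMul:
    lhs: "Expr"
    rhs: "Expr"
    out_dtype: str


@dataclass(frozen=True)
class AsType:
    arg: "Expr"
    dtype: str


@dataclass(frozen=True)
class FusedMatMulCast:
    """A matmul whose result is cast inside the same offloaded call."""
    lhs: "Expr"
    rhs: "Expr"
    accum_dtype: str
    out_dtype: str


Expr = Union[Var, MatMul, AsType, FusedMatMulCast]


def fuse_matmul_astype(expr: Expr) -> Expr:
    """Rewrite AsType(MatMul(a, b)) into a single FusedMatMulCast node."""
    if isinstance(expr, AsType):
        inner = fuse_matmul_astype(expr.arg)
        if isinstance(inner, MatMul):
            # The cast directly consumes a matmul: merge the two nodes.
            return FusedMatMulCast(inner.lhs, inner.rhs, inner.out_dtype, expr.dtype)
        return AsType(inner, expr.dtype)
    if isinstance(expr, MatMul):
        # No cast on top: keep the matmul, but still rewrite its operands.
        return MatMul(
            fuse_matmul_astype(expr.lhs),
            fuse_matmul_astype(expr.rhs),
            expr.out_dtype,
        )
    return expr


# Placeholder dtype strings, assumed for illustration only.
x = Var("x", "float8_e4m3")
w = Var("w", "float8_e4m3")
unfused = AsType(MatMul(x, w, "float32"), "float16")
fused = fuse_matmul_astype(unfused)
print(fused)
```

With the cast folded into the matched pattern, the backend sees one node instead of a matmul followed by a separate cast, which is the property the PR description relies on for FP8.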
Do we need to update the cuBLAS codegen or runtime to support the cast?