Satya Bhagavan
**What is your question?** The mainloop fusion examples provided in [25_ampere_fprop_mainloop_fusion](https://github.com/NVIDIA/cutlass/tree/main/examples/25_ampere_fprop_mainloop_fusion) and [26_ampere_wgrad_mainloop_fusion](https://github.com/NVIDIA/cutlass/tree/main/examples/26_ampere_wgrad_mainloop_fusion) use half-precision (float16). I want to adapt these examples to work with single-precision (float32). I changed...
**What is your question?** In the examples provided, EVT demonstrates the capability to fuse different epilogue functions, optimizing their execution. I'm interested in knowing whether EVT can also integrate the...