triton
[FRONTEND] Reduce kernel launch overhead using type hints
Hey, any news regarding this issue please? :-)
Do you feel confident that at some point we will be able to write CUDA kernels with Triton without a huge bottleneck from CPU overhead (cc: #416)? Thanks!
I wonder what we can do as a temporary fix in the meantime. E.g., where should we add type hints? Thanks.
Running into the same issue.
Could we have any temporary fixes for now?
@Adel-Moumen @Akimoto-Cris @void-main Been working on some tools for kernel compilation directly from jitted functions.
Happy to tailor these tools for your use case.
@jeromeku: I am interested in testing the tools you have created. How do I get in touch with you?