composable_kernel
Refactor ck_tile fMHA forward example
We need to wait for @danyao12 to merge his fmha bwd and dropout changes, then refactor all the updated example code together.
I will continue developing the fmha fwd + KV cache reference function based on the current design of HostTensor<>.
This PR is no longer needed.