composable_kernel
Wip 355 xcd remap
Added XCD remapping for the flatmm MoE kernel.
| batch | Mixtral (TFLOPS, wip_355) | Mixtral-7B (TFLOPS, our branch) | perf boost |
|---|---|---|---|
| 64 | 865.424 | 995.455 | 15.0% |
| 256 | 886.336 | 1020.96 | 15.2% |
| 1024 | 890.808 | 1022.53 | 14.8% |
Co-authors: @Chi-Chu319, @juuso-oskari
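
For readers unfamiliar with XCD remapping, here is a minimal sketch of the general idea, not the code in this PR. It assumes the hardware dispatches workgroups round-robin across XCDs (workgroup `pid` lands on XCD `pid % num_xcd`) and that there are 8 XCDs. Both are assumptions, and `remap_xcd` is a hypothetical helper name. The remap gives each XCD a contiguous range of logical tile IDs, so neighbouring tiles are more likely to share one XCD's L2 cache.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only; not part of the composable_kernel API.
// Assumption: workgroup `pid` is scheduled on XCD (pid % num_xcd).
// Returns a logical tile ID such that each XCD owns a contiguous ID range.
inline std::uint32_t remap_xcd(std::uint32_t pid,
                               std::uint32_t grid,
                               std::uint32_t num_xcd = 8)
{
    assert(num_xcd > 0 && pid < grid);

    // Workgroups per XCD, rounded up. When grid is not a multiple of
    // num_xcd, only the first `tall` XCDs receive this many.
    const std::uint32_t per_xcd = (grid + num_xcd - 1) / num_xcd;
    std::uint32_t tall = grid % num_xcd;
    if(tall == 0)
        tall = num_xcd;

    const std::uint32_t xcd   = pid % num_xcd; // XCD this workgroup runs on
    const std::uint32_t local = pid / num_xcd; // its index within that XCD

    if(xcd < tall)
        return xcd * per_xcd + local;

    // The remaining XCDs each hold one workgroup fewer.
    return tall * per_xcd + (xcd - tall) * (per_xcd - 1) + local;
}
```

A kernel would call this once on its flat block index and then derive tile coordinates from the returned ID instead of the raw index. The mapping is a bijection on `[0, grid)`, so every tile is still processed exactly once.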