
feat: Merge kernels from vLLM and FlashInfer

Open drunkcoding opened this issue 5 months ago • 0 comments

Description

Fuse the MoE layer kernels by merging in kernels from vLLM and FlashInfer.
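To make the idea concrete, here is a rough pure-Python sketch of what fusing could mean at the layer level. It is not the vLLM or FlashInfer code and not the current MoE-Infinity implementation. The function names, the top-1 routing, and the `launches` counter (a stand-in for GPU kernel launches) are all assumptions made for illustration.

```python
# Illustrative only: plain Python lists stand in for GPU tensors, and
# `launches` is a hypothetical counter standing in for kernel launches.

def matvec(x, w):
    """Multiply a token vector x of length d by an expert weight matrix w of shape [d][h]."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def moe_unfused(tokens, weights, expert_ids):
    """One pass per active expert; on a GPU each pass would cost at least one launch."""
    out = [None] * len(tokens)
    launches = 0
    for e in range(len(weights)):
        routed = [t for t, eid in enumerate(expert_ids) if eid == e]
        if not routed:
            continue
        launches += 1  # assumed: one kernel per active expert
        for t in routed:
            out[t] = matvec(tokens[t], weights[e])
    return out, launches

def moe_fused(tokens, weights, expert_ids):
    """One pass over all tokens; each token looks up its own expert's weights."""
    launches = 1  # assumed: a single fused kernel covers every expert
    out = [matvec(x, weights[eid]) for x, eid in zip(tokens, expert_ids)]
    return out, launches

# Four tokens of width 2, three experts with 2x2 integer weights, top-1 routing.
tokens = [[1, 2], [3, 4], [5, 6], [7, 8]]
weights = [
    [[1, 0], [0, 1]],
    [[2, 0], [0, 2]],
    [[0, 1], [1, 0]],
]
expert_ids = [0, 1, 2, 1]

out_unfused, launches_unfused = moe_unfused(tokens, weights, expert_ids)
out_fused, launches_fused = moe_fused(tokens, weights, expert_ids)
assert out_unfused == out_fused
```

Both paths produce the same outputs. The difference is that the unfused path's launch count grows with the number of active experts, while the fused path's stays at one.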

Motivation

Kernel launch overhead in the MoE layer is too large.

Type of Change

  • [ ] Bug fix
  • [x] New feature
  • [x] Breaking change
  • [x] Documentation update

Checklist

  • [x] I have read the CONTRIBUTION guide.
  • [x] I have updated the tests (if applicable).
  • [x] I have updated the documentation (if applicable).

drunkcoding, Jul 06 '25 19:07