feat: performance improvement and Qwen3 support

Open • drunkcoding opened this issue 7 months ago • 0 comments

Description

Major changes for performance improvement

Motivation

  • Support the latest Qwen3 MoE model
  • Overlap the hidden-state gather with the expert copy
  • Reduce PyTorch kernel launch overhead

Type of Change

  • [ ] Bug fix
  • [x] New feature
  • [x] Breaking change
  • [x] Documentation update

Checklist

  • [x] I have read the CONTRIBUTION guide.
  • [ ] I have updated the tests (if applicable).
  • [x] I have updated the documentation (if applicable).

drunkcoding • May 11 '25 20:05