ColossalAI

[FEATURE]: Expert Parallel for qwen/deepseek

Open Guodanding opened this issue 11 months ago • 4 comments

Describe the feature

Hello, are there any existing expert-parallel implementations for the newer MoE models, such as Qwen and DeepSeek?

Guodanding avatar Jan 12 '25 14:01 Guodanding

need FP8 training deepseek-MOE

shiyongde avatar Feb 19 '25 01:02 shiyongde

Bot detected the issue body's language is not English, translate it automatically.


need FP8 training deepseek-MOE

Issues-translate-bot avatar Feb 19 '25 01:02 Issues-translate-bot

EP for DeepSeek V3 is implemented; see our latest blog.

ver217 avatar Feb 20 '25 04:02 ver217

need FP8 training deepseek-MOE

The FP8 GEMM kernel released in DeepSeek's GitHub repo is currently sometimes less efficient than the BF16 GEMM provided by cuBLAS. We will not release the blockwise FP8 training feature until we resolve this efficiency issue.

ver217 avatar Feb 20 '25 04:02 ver217