DeepSpeed-MII
[QUERY] Expert Parallelism Supported?
I've been looking at the engine, and I'd like to run inference with the Mixtral model using expert parallelism. DeepSpeed itself seems to have some support for expert parallelism, but I saw a post saying I should use MII if I want to run inference with Mixtral.
Does anyone know whether MII supports expert parallelism? If not, does anyone know of an alternative system that does?