高璟琦
> @ghostplant Yes! And how to specify 'use_gate1' and 'use_gate2'?
> We are going to merge this: https://github.com/microsoft/tutel/pull/71/files You can create new moe layers by specifying a list of original gating types. And when forwarding the moe layer, you can...
@jaekyoungbae I just fixed the issues. Could you please check it again? I think it's ready to be merged.
My new model is implemented in this PR: https://github.com/OpenGenerativeAI/llm-colosseum/pull/45/files You can watch a video of my model vs. Mistral here: https://github.com/Tokkiu/llm-colosseum?tab=readme-ov-file#1-vs-1-mistral-vs-solar