高璟琦

15 comments by 高璟琦

> @ghostplant Yes! And how do I specify 'use_gate1' and 'use_gate2'?

> We are going to merge this: https://github.com/microsoft/tutel/pull/71/files. You can create new MoE layers by specifying a list of original gating types. And when forwarding the MoE layer, you can...

@jaekyoungbae I just fixed the issues. Could you please check it again? I think it's ready to be merged.

My new model is implemented in this PR: https://github.com/OpenGenerativeAI/llm-colosseum/pull/45/files. You can watch a video of my model vs. Mistral here: https://github.com/Tokkiu/llm-colosseum?tab=readme-ov-file#1-vs-1-mistral-vs-solar