Bo Zheng
# Adding Qwen2MoE

This PR adds code support for the upcoming Qwen2MoE models. For information about Qwen, please visit https://github.com/QwenLM/Qwen. @ArthurZucker
Is there an evaluation script that can directly compare a prediction file against the gold prediction file, i.e., the official evaluation script?
What is the "PART_I.txt" file used in the zh2en preprocessing scripts, and where can I download it?
I loaded the same model trained with Megatron + MegaBlocks and found that the load_balancing_loss is slightly different. When I increase pipeline_parallel_size, the load_balancing_loss also increases. Is it...