FlyCarrot

Results 4 issues of FlyCarrot

The error is shown below. The model is DeepSeek-V2-Lite, with shard set to 8. ``` model.layers.7.mlp.experts.3.w1w3 not in state_dict, loading deepseek-ai/DeepSeek-V2-Lite/model-00002-of-000004.safetensors ```

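As the title says: when converting an MoE model, xtuner/xtuner/utils/handle_moe_load_and_save.py is involved. Its `print_on_rank0` function is:

```python
def print_on_rank0(info):
    if dist.get_rank() == 0:
        print_log(info, 'current')
```

This requires multi-GPU (distributed) initialization, but no distributed initialization actually happens during convert, so it raises an error. I suggest changing the code so that convert prints directly instead of checking the rank.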
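One way the suggested change could look is sketched below. It is an illustration, not xtuner's actual fix: the helper `is_main_process` and the `printer` parameter are hypothetical names introduced here, and plain `print` stands in for the `print_log(info, 'current')` call in the issue. The only library calls used are `torch.distributed.is_available()`, `is_initialized()` and `get_rank()`.

```python
try:
    import torch.distributed as dist
except ImportError:
    # Assumption for this sketch: without torch we behave as a single process.
    dist = None


def is_main_process() -> bool:
    """Return True on rank 0, or when no process group has been initialized.

    `is_main_process` is a hypothetical helper, not part of xtuner.
    """
    if dist is None or not dist.is_available() or not dist.is_initialized():
        # Single-process case (e.g. the convert step): skip the rank check,
        # because dist.get_rank() would fail without an initialized group.
        return True
    return dist.get_rank() == 0


def print_on_rank0(info, printer=print):
    """Print `info` once: on rank 0 in distributed runs, always otherwise.

    `printer` is a hypothetical parameter; the original code calls
    print_log(info, 'current') instead of print.
    """
    if is_main_process():
        printer(info)


if __name__ == "__main__":
    print_on_rank0("loading shard")
```

Checking `is_initialized()` before `get_rank()` keeps the function usable both inside a distributed training job and in a standalone convert script.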

Hi, this is a good code repo, but I can't find the code that changes how the logits are calculated. Could you point me to the relevant line? Thanks a lot!

### Your current environment The output of `python collect_env.py` ```text Collecting environment information... PyTorch version: 2.4.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build...

bug