xtuner

DeepSeek V2 training in shard mode: missing keys such as w1w3 reported while loading weights

Open FlyCarrot opened this issue 1 year ago • 0 comments

The error is shown below. The model is DeepSeek-V2-Lite and the shard count is 8.

model.layers.7.mlp.experts.3.w1w3 not in state_dict, loading deepseek-ai/DeepSeek-V2-Lite/model-00002-of-000004.safetensors

FlyCarrot, Aug 01 '24 06:08