Megatron-LM
[QUESTION] When will the model have `_extra_state`?
After recently updating to the main branch of Megatron-LM, I encountered this error when loading a model:

`Unexpected key(s) in state_dict: "decoder.layers.0.self_attention.core_attention._extra_state"`

The checkpoint was converted with `tools/checkpoint/convert.py` and loaded by `pretrain_gpt.py`.
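For reference, a minimal sketch of one possible workaround (not necessarily the intended fix): assuming the only mismatch is these `_extra_state` entries, which appear to come from the TransformerEngine-backed attention modules, and assuming the converted checkpoint stores its weights under a `model` key, one could drop them before loading, or load non-strictly. The checkpoint path below is a placeholder.

```python
# Sketch of a possible workaround, assuming the converted checkpoint keeps its
# weights under a "model" key and the only unexpected keys end in "_extra_state".
# The checkpoint path is a placeholder, not a path from this issue.
import torch

ckpt_path = "iter_0000000/mp_rank_00/model_optim_rng.pt"  # placeholder path
ckpt = torch.load(ckpt_path, map_location="cpu")
state_dict = ckpt["model"]

# Drop entries such as
# "decoder.layers.0.self_attention.core_attention._extra_state".
ckpt["model"] = {
    k: v for k, v in state_dict.items() if not k.endswith("._extra_state")
}
torch.save(ckpt, ckpt_path + ".no_extra_state")

# Alternatively, if the model object is at hand, a non-strict load ignores
# unexpected keys (but also silently skips any other mismatches):
# model.load_state_dict(state_dict, strict=False)
```

Still, I would like to understand when and why these `_extra_state` keys are expected to be present.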