DeepSpeed
[BUG] Error happened when running step3_rlhf_finetuning in enable_hybrid_engine mode with togethercomputer/GPT-NeoXT-Chat-Base-20B
I have reported an issue in DeepSpeedExamples: https://github.com/microsoft/DeepSpeedExamples/issues/448
To dig deeper, I looked at the module definition for GPT-NeoX in this file: https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/module_inject/containers/gptneox.py
The call chain is: DeepSpeedGPTInference -> DeepSpeedTransformerInference -> DeepSpeedSelfAttention
The implementation of DeepSpeedSelfAttention seems to be inconsistent with the Hugging Face GPT-NeoX implementation (https://github.com/huggingface/transformers/blob/main/src/transformers/models/gpt_neox/modeling_gpt_neox.py):
for example, GPTNeoXAttention in Hugging Face includes a RotaryEmbedding, but DeepSpeedSelfAttention appears to have no rotary-embedding logic at all. I also noticed that other models such as DS_GPTNEO and DS_BERT reuse the same DeepSpeedSelfAttention implementation.
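For reference, the rotary position embedding that GPTNeoXAttention applies to queries and keys can be sketched as below. This is a simplified numpy illustration of the idea, not the actual HF code: the real implementation operates on torch tensors, only rotates a `rotary_pct` fraction of each head dimension, and caches the cos/sin tables.

```python
import numpy as np

def rotate_half(x):
    # Split the last dimension in half and swap with negation: (x1, x2) -> (-x2, x1)
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([-x2, x1], axis=-1)

def apply_rotary_pos_emb(q, positions, base=10000):
    # q: (seq_len, head_dim); positions: (seq_len,) absolute token positions.
    # Each pair (q_i, q_{i+half}) is rotated by an angle that depends on
    # the token position and the pair's frequency.
    head_dim = q.shape[-1]
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    freqs = np.outer(positions, inv_freq)           # (seq_len, head_dim/2)
    emb = np.concatenate([freqs, freqs], axis=-1)   # (seq_len, head_dim)
    cos, sin = np.cos(emb), np.sin(emb)
    return q * cos + rotate_half(q) * sin
```

Because this is a pure rotation, it preserves vector norms, and a token at position 0 is left unchanged. If the fused DeepSpeed attention kernel skips this step for a GPT-NeoX checkpoint, the attention scores would be computed without positional information, which could plausibly explain garbage output or a crash in hybrid-engine mode.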
So far, facebook/opt-1.3b runs successfully with --enable_hybrid_engine, but GPT-NeoXT-Chat-Base-20B fails.
Can you help me track down the problem? Thanks.