Megatron-LM
[BUG] Interaction between the Megatron-core, --transformer-impl, and flash-attention options.
Describe the bug Enabling --use-mcore-models and --use-flash-attn while setting --transformer-impl local results in flash attention not being used.
To Reproduce N/A
Expected behavior N/A
Stack trace/logs N/A
Environment (please complete the following information):
- Megatron-LM commit ID : ba773259dbe5735fbd91ca41e7f4ded60b335c52
Proposed fix N/A
Additional context N/A
When you use --use-mcore-models, you cannot use local. --use-flash-attn decides whether to use the OSS flash attention implementation or the cuDNN implementation.
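A minimal sketch of the kind of validation being discussed, assuming an argparse-style setup; this is hypothetical illustration, not Megatron-LM's actual argument-parsing code (only the flag names come from the issue, the warning text is made up):

```python
import argparse
import warnings

# Hypothetical reconstruction of the three flags from the issue.
parser = argparse.ArgumentParser()
parser.add_argument("--use-mcore-models", action="store_true")
parser.add_argument("--use-flash-attn", action="store_true")
parser.add_argument("--transformer-impl",
                    choices=["local", "transformer_engine"],
                    default="transformer_engine")

def validate(args):
    # The local implementation has no flash-attention path, so this
    # combination would otherwise be silently ignored; emit a warning
    # instead of letting the user assume flash attention is active.
    if args.use_flash_attn and args.transformer_impl == "local":
        warnings.warn("--use-flash-attn has no effect with "
                      "--transformer-impl local; use transformer_engine.")
    return args
```

Running `validate(parser.parse_args(["--transformer-impl", "local", "--use-flash-attn"]))` would surface the conflict instead of silently dropping flash attention.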
hi @ethanhe42, I understand the process you mentioned, but currently this configuration produces no warning, which is not very user-friendly.
Marking as stale. No activity in 60 days.