Megatron-LM add core_attention_bias_type to TransformerConfig

Open ethanhe42 opened this issue 5 months ago • 2 comments

core_attention_bias_type is needed in TransformerConfig in order to use ALiBi from Transformer Engine, whose DotProductAttention.forward takes it as an argument: https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/api/pytorch.html?highlight=alibi#transformer_engine.pytorch.DotProductAttention.forward
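
To make the request concrete, here is a rough sketch of what such a field could look like. It is not Megatron-LM code: `SketchTransformerConfig` and `attention_forward_kwargs` are hypothetical names invented for illustration, and the list of accepted values is taken from the Transformer Engine docs linked above, so it should be checked against the installed version.

```python
from dataclasses import dataclass

# Bias types as listed in the Transformer Engine docs for
# DotProductAttention.forward (assumption: verify against your TE version).
VALID_BIAS_TYPES = ("no_bias", "pre_scale_bias", "post_scale_bias", "alibi")


@dataclass
class SketchTransformerConfig:
    """Hypothetical stand-in for TransformerConfig, not the real class."""

    num_attention_heads: int = 8
    # Proposed field; defaulting to "no_bias" keeps existing behaviour.
    core_attention_bias_type: str = "no_bias"

    def __post_init__(self):
        # Reject unknown values early instead of failing inside attention.
        if self.core_attention_bias_type not in VALID_BIAS_TYPES:
            raise ValueError(
                f"core_attention_bias_type must be one of {VALID_BIAS_TYPES}, "
                f"got {self.core_attention_bias_type!r}"
            )


def attention_forward_kwargs(config):
    """Hypothetical helper: extra kwargs to pass on to the attention forward call."""
    return {"core_attention_bias_type": config.core_attention_bias_type}


if __name__ == "__main__":
    config = SketchTransformerConfig(core_attention_bias_type="alibi")
    print(attention_forward_kwargs(config))
```

With something along these lines, the attention wrapper could forward the value to Transformer Engine without every caller having to hard-code it.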

Jan 21 '24 10:01 ethanhe42
