
[QUESTION] rotary position embedding

Open bugm opened this issue 8 months ago • 1 comments

Hello, I have a question about the code at https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/models/common/embeddings/rope_utils.py#L116

When _apply_rotary_pos_emb_bshd is called, its behavior for MLA differs from that for standard GQA or MHA. For MLA, there is an extra step that rearranges the input tensor so that the even-indexed dims move to the first half and the odd-indexed dims move to the second half. Can anyone explain the purpose of this step? Thanks!
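
To make the question concrete, here is a minimal pure-Python sketch of the rearrangement described above. It is not the Megatron-LM code: the helper names (`deinterleave`, `rope_half_split`, `rope_interleaved`) and the `base=10000.0` default are assumptions chosen for illustration. The sketch only shows one mathematical fact: moving even dims to the first half and odd dims to the second half, then applying the usual rotate-half RoPE, gives the same result as RoPE that rotates adjacent pairs (x[0], x[1]), (x[2], x[3]), ..., up to that same permutation of the output. Whether this layout conversion is the actual reason for the MLA branch is not confirmed anywhere in this thread.

```python
import math

def deinterleave(x):
    # Hypothetical helper: even-indexed dims first, then odd-indexed dims.
    return x[0::2] + x[1::2]

def rotate_half(x):
    # [first_half, second_half] -> [-second_half, first_half]
    h = len(x) // 2
    return [-v for v in x[h:]] + x[:h]

def rope_half_split(x, pos, base=10000.0):
    # Rotate-half convention: dim i is paired with dim i + d/2.
    d = len(x)
    angles = [pos * base ** (-2.0 * i / d) for i in range(d // 2)]
    angles = angles + angles  # same angle for both members of a pair
    rot = rotate_half(x)
    return [x[j] * math.cos(angles[j]) + rot[j] * math.sin(angles[j])
            for j in range(d)]

def rope_interleaved(x, pos, base=10000.0):
    # Interleaved convention: dim 2i is paired with dim 2i + 1.
    d = len(x)
    out = [0.0] * d
    for i in range(d // 2):
        a = pos * base ** (-2.0 * i / d)
        c, s = math.cos(a), math.sin(a)
        out[2 * i] = x[2 * i] * c - x[2 * i + 1] * s
        out[2 * i + 1] = x[2 * i] * s + x[2 * i + 1] * c
    return out

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

q = [0.1, -0.4, 0.7, 0.2, -0.9, 0.3, 0.5, -0.6]
k = [0.8, 0.1, -0.3, 0.6, 0.2, -0.7, 0.4, 0.9]

# Rearranging first and then using rotate-half equals the rearranged
# output of the interleaved rotation.
lhs = rope_half_split(deinterleave(q), pos=5)
rhs = deinterleave(rope_interleaved(q, pos=5))
max_diff = max(abs(a - b) for a, b in zip(lhs, rhs))

# The q.k score is unchanged because q and k are permuted identically.
score_a = dot(rope_half_split(deinterleave(q), 5),
              rope_half_split(deinterleave(k), 2))
score_b = dot(rope_interleaved(q, 5), rope_interleaved(k, 2))
```

In this sketch `max_diff` and `score_a - score_b` are both zero up to floating-point error, so the output never needs to be permuted back as long as queries and keys go through the same rearrangement.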

bugm avatar Apr 18 '25 08:04 bugm

+1

jinqinn avatar May 22 '25 02:05 jinqinn

Marking as stale. No activity in 60 days.

github-actions[bot] avatar Jul 21 '25 18:07 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Jul 29 '25 02:07 github-actions[bot]