
[Question] The RoPE implementation is inconsistent with the paper

Open · zehmaaa opened this issue 1 year ago · 1 comment

Required prerequisites

Questions

Why does the implementation here differ from the one in the paper?

def rotate_half(x):
    """Rotates half the hidden dims of the input."""
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2:]
    return torch.cat((-x2, x1), dim=-1)

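The computation in the paper is: [image]

With this implementation, the final result would instead be: [image]

I see that Hugging Face does it the same way. I'm curious why this implementation was chosen.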
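
The difference being asked about can be made concrete with a small sketch. It assumes the usual RoPE angles `theta_i = pos * base^(-2i/dim)` with `base = 10000` and uses plain Python lists instead of tensors; `angles`, `rope_rotate_half`, and `rope_interleaved` are hypothetical names chosen for illustration, not functions from Baichuan-7B or Hugging Face.

```python
import math

def rotate_half(x):
    # List version of the snippet above: [x1, x2] -> [-x2, x1]
    half = len(x) // 2
    return [-v for v in x[half:]] + x[:half]

def angles(pos, dim, base=10000.0):
    # One angle per 2-D rotation (assumed base of 10000)
    return [pos * base ** (-2.0 * i / dim) for i in range(dim // 2)]

def rope_rotate_half(x, pos):
    # x * cos + rotate_half(x) * sin, with the angle list repeated for both halves,
    # so dimension i is paired with dimension i + dim/2
    th = angles(pos, len(x)) * 2
    r = rotate_half(x)
    return [xi * math.cos(t) + ri * math.sin(t) for xi, ri, t in zip(x, r, th)]

def rope_interleaved(x, pos):
    # Paper-style pairing: rotate adjacent pairs (x[0], x[1]), (x[2], x[3]), ...
    out = []
    for i, t in enumerate(angles(pos, len(x))):
        a, b = x[2 * i], x[2 * i + 1]
        out += [a * math.cos(t) - b * math.sin(t),
                a * math.sin(t) + b * math.cos(t)]
    return out

x = [1.0, 2.0, 3.0, 4.0]
print(rope_rotate_half(x, pos=1))   # pairs (x[0], x[2]) and (x[1], x[3])
print(rope_interleaved(x, pos=1))   # pairs (x[0], x[1]) and (x[2], x[3])
```

Both functions apply the same set of 2-D rotations; they differ only in which dimensions are grouped into a pair, so the two printed vectors are not equal element by element.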

Checklist

  • [X] I have provided all relevant and necessary information above.
  • [X] I have chosen a suitable title for this issue.

zehmaaa · Oct 04 '23 08:10

The positions of the neurons in an embedding have no inherent order, so you can simply pick any half to rotate.

xinge333 · Jul 03 '24 06:07
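
The reply's argument can be checked numerically. The following is a minimal sketch, not the Baichuan-7B or Hugging Face code: `rope`, `dot`, and `to_half_split_layout` are hypothetical helpers, the base of 10000 is assumed, and the query and key vectors are random. It checks two things: that the paper-style pairing and the half-split pairing give the same attention score once `q` and `k` are reordered in the same way, and that the half-split score still depends only on the relative offset between positions.

```python
import math
import random

def rope(x, pos, pairs, base=10000.0):
    """Rotate each index pair (a, b) of x by the angle pos * base**(-2i/dim)."""
    out = list(x)
    for i, (a, b) in enumerate(pairs):
        t = pos * base ** (-2.0 * i / len(x))
        c, s = math.cos(t), math.sin(t)
        out[a] = x[a] * c - x[b] * s
        out[b] = x[a] * s + x[b] * c
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def to_half_split_layout(x):
    # Reorder [x0, x1, x2, x3, ...] into [x0, x2, ..., x1, x3, ...]
    return x[0::2] + x[1::2]

dim = 8
interleaved = [(2 * i, 2 * i + 1) for i in range(dim // 2)]  # paper-style pairs
half_split = [(i, i + dim // 2) for i in range(dim // 2)]    # rotate_half-style pairs

rng = random.Random(0)
q = [rng.uniform(-1, 1) for _ in range(dim)]
k = [rng.uniform(-1, 1) for _ in range(dim)]

# 1) Same attention score once q and k are reordered consistently.
score_paper = dot(rope(q, 3, interleaved), rope(k, 7, interleaved))
score_half = dot(rope(to_half_split_layout(q), 3, half_split),
                 rope(to_half_split_layout(k), 7, half_split))

# 2) The half-split score depends only on the relative offset (7 - 3 = 17 - 13).
score_base = dot(rope(q, 3, half_split), rope(k, 7, half_split))
score_shifted = dot(rope(q, 13, half_split), rope(k, 17, half_split))

print(math.isclose(score_paper, score_half, abs_tol=1e-9))
print(math.isclose(score_base, score_shifted, abs_tol=1e-9))
```

In a trained model the reordering is not applied explicitly: the query and key projections are learned, so the order of their output dimensions is arbitrary, provided queries and keys use the same pairing. The caveat is that weights trained under one pairing would need the matching reorder before being used with the other.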