
How is the 8192-token maximum document length in BGE-M3's Multi-Granularity implemented?

Open chengzi-big opened this issue 1 year ago • 4 comments

chengzi-big avatar Apr 30 '24 03:04 chengzi-big

We pre-train and fine-tune bge-m3 on long texts. You can refer to our paper: https://arxiv.org/abs/2402.03216

staoxiao avatar Apr 30 '24 14:04 staoxiao
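(A minimal usage sketch, assuming the public BAAI/bge-m3 checkpoint and the BGEM3FlagModel API from this repository; the placeholder document and the printed shape are illustrative only.)

```python
from FlagEmbedding import BGEM3FlagModel

# Load bge-m3; use_fp16=True trades a little precision for faster inference.
model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)

# A long document can be passed directly; max_length=8192 uses the model's
# full context window (a smaller value speeds up encoding of short texts).
long_doc = "..."  # placeholder for a document of several thousand tokens
output = model.encode([long_doc], max_length=8192)

dense_vecs = output['dense_vecs']  # dense embeddings, 1024-dim for bge-m3
print(dense_vecs.shape)
```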

What I meant is that Transformer-based models normally have a maximum input length of no more than 512 tokens. How does the BGE-M3 model extend its input to 8192 tokens?

chengzi-big avatar May 02 '24 03:05 chengzi-big

@chengzi-big , the Transformer architecture itself does not have a length limit; the limit comes from the positional encoding. We use absolute positional encoding with a maximum length of 8192.

staoxiao avatar May 02 '24 14:05 staoxiao
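(A minimal sketch to see this concretely, assuming the public BAAI/bge-m3 checkpoint on the Hugging Face Hub: the released config exposes the extended position-embedding size directly.)

```python
from transformers import AutoConfig, AutoTokenizer

# Inspect the released checkpoint's extended absolute position embeddings.
config = AutoConfig.from_pretrained("BAAI/bge-m3")
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")

# For XLM-RoBERTa-style models, max_position_embeddings includes a small
# offset for the padding index, so a value slightly above 8192 is expected.
print(config.max_position_embeddings)
print(tokenizer.model_max_length)  # sequence-length limit applied at tokenization time
```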

Thank you for your reply; your answer helped me a lot.

chengzi-big avatar May 03 '24 02:05 chengzi-big

> @chengzi-big , the Transformer architecture itself does not have a length limit; the limit comes from the positional encoding. We use absolute positional encoding with a maximum length of 8192.

@staoxiao @namespace-Pt Which absolute positional encoding did you use? Did you not use RoPE? Are the absolute position embeddings trained together with the model, and how were they initialized? Could you share some details? I read the bge-m3 paper carefully and it does not seem to mention this.

shuiyigt avatar May 31 '24 08:05 shuiyigt