Shitao Xiao

Results: 509 comments by Shitao Xiao

I suggest checking whether any files are missing from data/models/bge-m3. It runs without problems on our side.
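A minimal sketch of such a check, assuming you already know which files the model directory should contain. The helper `missing_files` and the file names used in the demo are hypothetical placeholders, not an official list for bge-m3; compare against the file listing of the model repository you downloaded from.

```python
import tempfile
from pathlib import Path


def missing_files(model_dir, expected_names):
    """Return the names from expected_names that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in expected_names if not (root / name).is_file()]


# Demo on a temporary directory so the example runs anywhere.
# The file names below are placeholders chosen for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text("{}")
    demo_missing = missing_files(tmp, ["config.json", "tokenizer.json"])

print(demo_missing)
```

In practice you would call `missing_files("data/models/bge-m3", expected)` with `expected` taken from the upstream repository's file list.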

Yes, the m3 model does not need an instruction to be added.

You can refer to https://github.com/huggingface/text-embeddings-inference

Hi, thanks for your interest in our work. You can use HuggingfaceEmbedding to load the bge model in LlamaIndex. LlamaIndex also has its own training script that you can use. If you...

Thanks for your interest in our work! `compute_score` is an example of how to compute the hybrid scores. If you have a better implementation, you are welcome to submit a PR. If you...

One possible cause is an outdated version of transformers. Try upgrading transformers.

The error seems to be related to your device. A smaller batch size may help (you can set it in https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/flag_dres_model.py#L16)

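If the negatives contain false negatives, or are too similar to the positives, the overall scores will drop. That is fine as long as the positive still scores higher than the negatives.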
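The point about relative order can be illustrated with a small sketch. The helper `ranks_positive_first` is hypothetical and the scores are invented for illustration, not measured from any model.

```python
def ranks_positive_first(pos_score, neg_scores):
    """True if the positive passage scores above every negative passage."""
    return all(pos_score > s for s in neg_scores)


# Illustrative scores only: with hard negatives every score is lower,
# but the positive still ranks first, so retrieval is unaffected.
easy_case = ranks_positive_first(0.82, [0.31, 0.25])
hard_case = ranks_positive_first(0.58, [0.55, 0.49])

# The case to worry about is when a negative overtakes the positive.
broken_case = ranks_positive_first(0.50, [0.55, 0.49])

print(easy_case, hard_case, broken_case)
```

What matters for retrieval is the ordering within each query, not the absolute score values.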

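> One more piece of information from our experiments: whether the normalized parameter is enabled has a large effect on the distribution of the model's outputs. With normalized enabled, the vectors are normalized, so the score that is finally computed is the cosine similarity, which lies in [-1, 1]. If it is set to False, similarity is computed as the inner product of the vectors, and the inner product has no bounded range.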
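A small pure-Python sketch of the difference described in the quote. The helper names and the toy vectors are assumptions made for illustration; they are not the library's implementation.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy embeddings pointing in the same direction with different lengths.
query = [3.0, 4.0]
passage = [6.0, 8.0]

# Without normalization the score grows with vector length (unbounded).
raw_score = inner_product(query, passage)

# With normalization the inner product equals the cosine similarity,
# which always lies in [-1, 1].
cosine_score = inner_product(l2_normalize(query), l2_normalize(passage))

print(raw_score, cosine_score)
```

Because the unnormalized score depends on vector length, thresholds tuned for one setting do not carry over to the other.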

Vespa currently has fairly good support: https://github.com/vespa-engine/pyvespa/blob/master/docs/sphinx/source/examples/mother-of-all-embedding-models-cloud.ipynb