Shitao Xiao comments

Results: 509 comments of Shitao Xiao

How can BGE-M3 hybrid search be used together with a vector database?

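Some RAG libraries already integrate hybrid retrieval / multi-route recall, for example: https://docs.llamaindex.ai/en/stable/examples/vector_stores/PineconeIndexDemo-Hybrid.html The main flow is that each index retrieves its own top-k texts, then all results are pooled and re-ranked by a weighted sum of their scores (or re-ranked with a bge-reranker model). In our tests we used pyserini for sparse retrieval and faiss for dense vector retrieval; the colbert multi-vector mode was used only in the re-ranking stage. The test scripts will be open-sourced later. Unfortunately, we have not yet tried using bge-m3's multiple modes directly in the existing popular open-source RAG libraries. We will work with the community to get this done as soon as possible.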
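The pool-and-re-rank step described above can be sketched as follows. This is a minimal illustration, not the test script mentioned in the reply: the `{doc_id: score}` result format, the function name `fuse_weighted`, and the weights are all hypothetical. Scores from different retrievers may also sit on different scales, so some normalization before summing may be needed; that is omitted here.

```python
from collections import defaultdict


def fuse_weighted(result_lists, weights, top_k=10):
    """Merge per-index results by a weighted sum of their scores.

    result_lists: one {doc_id: score} dict per index (hypothetical format).
    weights: one weight per index (illustrative values, not tuned).
    """
    if len(result_lists) != len(weights):
        raise ValueError("need exactly one weight per result list")
    fused = defaultdict(float)
    for results, weight in zip(result_lists, weights):
        for doc_id, score in results.items():
            # A document missing from an index contributes 0 for that index.
            fused[doc_id] += weight * score
    # Highest fused score first; ties are broken by doc_id for determinism.
    ranked = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Toy top-k results from a sparse index and a dense index.
sparse_hits = {"d1": 0.9, "d2": 0.4}
dense_hits = {"d2": 0.8, "d3": 0.7}
fused = fuse_weighted([sparse_hits, dense_hits], weights=[0.3, 0.7], top_k=3)
```

A document returned by only one index still takes part in the ranking; it simply receives no contribution from the other index.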

torch.cuda.OutOfMemoryError: CUDA out of memory when computing scores with different methods

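Hello, the default maximum length in compute_score is 8192. The earlier code padded every input to the maximum length of 8192, which consumed a lot of GPU memory. The code has been updated and now pads to the longest input instead; you can install the latest code and try again. You can also control the length by setting max_passage_length.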
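To see why the padding change matters, here is a small sketch contrasting the two strategies on plain Python lists. It is not the library's code: `pad_batch`, `pad_id`, and `fixed_length` are hypothetical names chosen for illustration.

```python
def pad_batch(token_id_lists, pad_id=0, fixed_length=None):
    """Pad every sequence to fixed_length, or to the longest sequence if None."""
    if fixed_length is not None:
        target = fixed_length
    else:
        target = max(len(ids) for ids in token_id_lists)
    padded = []
    for ids in token_id_lists:
        ids = ids[:target]  # truncate anything longer than the target
        padded.append(ids + [pad_id] * (target - len(ids)))
    return padded


batch = [[101, 7, 8, 102], [101, 9, 102]]
old_style = pad_batch(batch, fixed_length=8192)  # every row has 8192 columns
new_style = pad_batch(batch)                     # rows are only as long as the longest input
```

With two short inputs, the fixed-length variant produces 2 x 8192 positions, while padding to the longest input produces 2 x 4, which is where the memory saving comes from.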

How was MKQA converted into a retrieval dataset?

You can see the evaluation script in https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB/MKQA

compute_lexical_matching_score VS ElasticSearch

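It should be possible to import the weights into ES and compute the score yourself. We have not tried implementing this in ES ourselves; if you have, a PR would be very welcome.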
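For anyone computing the score outside the library, a minimal sketch follows. It assumes the lexical weights are available as a token-to-weight mapping and that the score is the sum of weight products over tokens shared by query and passage. That is an assumption about what compute_lexical_matching_score computes, so check it against the library source before relying on it. The function name and the toy weights are illustrative.

```python
def lexical_matching_score(query_weights, passage_weights):
    """Sum weight products over tokens present in both mappings (assumed formula)."""
    score = 0.0
    for token, weight in query_weights.items():
        if token in passage_weights:
            score += weight * passage_weights[token]
    return score


# Toy token weights; real weights would come from the model's sparse output.
query_weights = {"what": 0.1, "bge": 0.3, "m3": 0.25}
passage_weights = {"bge": 0.2, "m3": 0.4, "embedding": 0.3}
score = lexical_matching_score(query_weights, passage_weights)
```

Only the shared tokens ("bge" and "m3" here) contribute, which is what would let an inverted index with stored per-token weights reproduce the same sum.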

Loss function for fine-tuning

Hi, the loss function is at https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/finetune/modeling.py#L98 Yes, we use CrossEntropyLoss with in-batch negative sampling. We use the provided negatives and, if `use_inbatch_neg` is set to True (default value is...
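As an illustration of cross-entropy with in-batch negatives, here is a dependency-free sketch. It is not a transcription of the linked modeling.py: the passage layout (the positive first in each group, followed by that query's provided negatives), the `temperature` parameter, and the function names are assumptions made for the example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def inbatch_cross_entropy(queries, passages, group_size, temperature=1.0):
    """Mean cross-entropy where each query is scored against ALL passages in the batch.

    Assumed layout: passages[i * group_size] is the positive for query i, and the
    rest of that group are its provided negatives. Every other query's passages
    then act as in-batch negatives.
    """
    losses = []
    for i, query in enumerate(queries):
        logits = [dot(query, passage) / temperature for passage in passages]
        target = i * group_size
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[target])
    return sum(losses) / len(losses)


queries = [[1.0, 0.0], [0.0, 1.0]]
passages = [
    [1.0, 0.0], [0.0, 0.0],  # group for query 0: positive, provided negative
    [0.0, 1.0], [0.0, 0.0],  # group for query 1: positive, provided negative
]
loss = inbatch_cross_entropy(queries, passages, group_size=2)
```

Restricting the candidate list to each query's own group would give the variant without in-batch negatives.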

How to improve inference speed

You can use some acceleration methods, such as converting the model to ONNX: https://github.com/FlagOpen/FlagEmbedding/issues/400 or using Hugging Face's acceleration library: https://github.com/huggingface/text-embeddings-inference

Error from the hard negatives mining command

There is not enough information to diagnose the problem accurately. One possible cause is that RAM or GPU memory ran out.

How batch_size is calculated

Hello, it is Device_num X per_device_batch_size; we did not enable accumulation_steps.

DistributedDataParallel for multi-GPU inference

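Yes, DataParallel runs in a single process with very frequent GPU communication, so it is much slower than DistributedDataParallel. Its advantage is that it takes only one line of code and is very simple to use. If you want more efficient multi-GPU inference, we recommend DistributedDataParallel; the llm-embedder codebase uses this approach, see https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/llm_embedder/src/retrieval/modeling_dense.py#L414 . PRs are also welcome.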

DistributedDataParallel for multi-GPU inference

Yes. A simple approach is to split the data into several parts and manually launch multiple programs, each encoding one part of the data.
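The manual split can be sketched as below. `split_into_shards` is a hypothetical helper, and how each program receives its shard (for example a shard index passed on the command line) is left open.

```python
def split_into_shards(items, num_shards):
    """Split items into num_shards contiguous parts whose sizes differ by at most one."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    base, extra = divmod(len(items), num_shards)
    shards, start = [], 0
    for shard_id in range(num_shards):
        # The first `extra` shards take one additional item each.
        size = base + (1 if shard_id < extra else 0)
        shards.append(items[start:start + size])
        start += size
    return shards


texts = [f"doc-{i}" for i in range(10)]
shards = split_into_shards(texts, num_shards=4)
```

Each launched program would then encode only its own shard, typically restricted to one GPU through the CUDA_VISIBLE_DEVICES environment variable. Because the shards are contiguous, concatenating the per-shard outputs in shard order restores the original order.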