
Retrieval and Retrieval-augmented LLMs

Results: 622 FlagEmbedding issues

What are the hardware requirements for running the bge-m3 model? For example: GPU model, VRAM, and system memory.

File "/data/FlagEmbedding/FlagEmbedding/bge_m3.py", line 6, in <module>
    import datasets
ModuleNotFoundError: No module named 'datasets'
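The traceback points at a missing dependency at import time. A minimal pre-flight check, assuming the missing package is Hugging Face's `datasets` (the package list below is illustrative), could look like:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Packages bge_m3.py imports at module load time (assumed list).
required = ["datasets"]
missing = missing_packages(required)
if missing:
    # Typically fixed with: pip install datasets
    print("Missing packages:", ", ".join(missing))
```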

bge-m3's sparse vectors appear to be derived from tokenization. Is there any chance of trying the SPLADE approach in the future? I haven't seen an open-source Chinese SPLADE model yet, so I don't know how well it would work.
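For context, both BGE-M3's lexical matching and SPLADE-style scoring boil down to a dot product over per-token weights shared by query and document. A toy sketch of that scoring, with invented weights, is:

```python
def lexical_score(query_weights, doc_weights):
    """Dot product over the tokens shared by query and document.

    query_weights / doc_weights map token -> weight, as produced by a
    learned sparse model (BGE-M3's lexical weights, or SPLADE-style
    expansion weights).
    """
    return sum(w * doc_weights[t] for t, w in query_weights.items() if t in doc_weights)

# Toy example with invented weights:
q = {"sparse": 0.9, "vector": 0.7}
d = {"sparse": 0.8, "retrieval": 0.5, "vector": 0.4}
score = lexical_score(q, d)  # 0.9*0.8 + 0.7*0.4 = 1.0
```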

https://github.com/FlagOpen/FlagEmbedding/blob/bd38bd350054d0dba39ea8d602afac1fab141b35/FlagEmbedding/reranker/data.py#L42 In the code, padding=False: item = self.tokenizer.encode_plus( qry_encoding, doc_encoding, truncation=True, max_length=self.args.max_len, padding=False, ) But the help text for the parameter says the input will be padded. So during actual training, is the input padded or not? max_len: int = field( default=512, metadata={ "help": "The maximum total input sequence length after tokenization for input text....
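On the question itself: `padding=False` at tokenization time usually means padding is deferred to the batch collator, which pads each batch to its own longest sequence rather than to the global max_len. That dynamic padding can be sketched in pure Python (pad token id 0 is an assumption):

```python
def pad_batch(batch, pad_id=0):
    """Pad variable-length token-id lists to the batch maximum,
    returning padded ids and a matching attention mask."""
    max_len = max(len(seq) for seq in batch)
    ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return ids, mask

ids, mask = pad_batch([[101, 7, 102], [101, 7, 8, 9, 102]])
# ids  -> [[101, 7, 102, 0, 0], [101, 7, 8, 9, 102]]
# mask -> [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```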

If it was not released publicly yet, when it will be avaiable? Thanks a lot.

Some weights of XLMRobertaForSequenceClassification were not initialized from the model checkpoint at /bge/FlagEmbedding/examples/reranker/sft_model/0221-4/merged and are newly initialized: ['classifier.out_proj.bias', 'classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.weight'] You should probably TRAIN this model on a down-stream...
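That warning is expected when the checkpoint lacks the sequence-classification head: parameters the model defines but the checkpoint does not provide are freshly initialized. The bookkeeping behind the warning can be sketched as (key names are illustrative):

```python
def newly_initialized(model_keys, checkpoint_keys):
    """Parameters the model defines but the checkpoint does not provide."""
    return sorted(set(model_keys) - set(checkpoint_keys))

model_keys = [
    "roberta.embeddings.word_embeddings.weight",
    "classifier.dense.weight", "classifier.dense.bias",
    "classifier.out_proj.weight", "classifier.out_proj.bias",
]
checkpoint_keys = ["roberta.embeddings.word_embeddings.weight"]
print(newly_initialized(model_keys, checkpoint_keys))
# ['classifier.dense.bias', 'classifier.dense.weight',
#  'classifier.out_proj.bias', 'classifier.out_proj.weight']
```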

Hello, I'd like to ask: does BGE-M3 support content retrieval within large documents? I want to build a smart customer-service bot whose knowledge base is stored as documents; given a user query, it should retrieve a specific sentence from a document as the answer. Is that supported? My understanding is that the retrieval described in the docs is still sentence-to-sentence similarity search, which cannot directly extract a sentence from a document to answer the user's question.
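Sentence-level retrieval from a document can be built on top of an embedding model: split the document into sentences, embed each one, and return the sentence most similar to the query. A sketch using cosine similarity over precomputed vectors (the embedding step is stubbed out with toy 3-d vectors; in practice they would come from a model such as BGE-M3):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def best_sentence(query_vec, sentences, sentence_vecs):
    """Return the sentence whose embedding is closest to the query's."""
    scores = [cosine(query_vec, v) for v in sentence_vecs]
    return sentences[scores.index(max(scores))]

# Toy vectors standing in for real sentence embeddings:
sentences = ["Refunds take 3-5 days.", "We ship worldwide."]
vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9]]
query_vec = [0.8, 0.2, 0.1]  # embedding of "How long do refunds take?"
print(best_sentence(query_vec, sentences, vecs))  # Refunds take 3-5 days.
```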

A change to resolve #464. It avoids the problem during hard-negative mining where, when the number of recalled negatives is smaller than the preset number of negatives to sample, positives may be randomly sampled or negatives may be sampled repeatedly. The change: by default, remove the positives and the already-recalled negatives from `corpus` before random sampling; if `corpus` becomes empty after removal, repeated negative sampling is needed to meet the required count, in which case only the positives are removed and negatives are sampled with repetition.
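The sampling rule described above can be sketched as follows (function and argument names are illustrative, not the PR's actual code):

```python
import random

def sample_negatives(corpus, positives, recalled_negatives, k, rng=random):
    """Draw up to k extra negatives for hard-negative mining.

    By default sample from corpus minus positives and already-recalled
    negatives; if that pool is empty, fall back to sampling with
    repetition from corpus minus positives only.
    """
    pool = [d for d in corpus if d not in positives and d not in recalled_negatives]
    if pool:
        return rng.sample(pool, min(k, len(pool)))
    fallback = [d for d in corpus if d not in positives]
    return [rng.choice(fallback) for _ in range(k)]
```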

Hello, while fine-tuning bge 1.5 I wanted to add an evaluation set during training and save the best model according to the eval loss, so I added the following parameters: eval_dataset=eval_dataset, evaluation_strategy="epoch", save_strategy="epoch", save_total_limit=3, load_best_model_at_end=True. But I hit an error: the corresponding loss cannot be found during evaluation. Traceback (most recent call last): File "C:\Users\Lichengyang\Desktop\FlagEmbedding-master\FlagEmbedding-master\FlagEmbedding\baai_general_embedding\finetune\run.py", line 116, in main() File "C:\Users\Lichengyang\Desktop\FlagEmbedding-master\FlagEmbedding-master\FlagEmbedding\baai_general_embedding\finetune\run.py", line 102, in main...
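Conceptually, load_best_model_at_end tracks an eval metric per epoch and restores the checkpoint with the best value; if evaluation never produces a loss (for example, the model's forward pass returns no loss on eval batches), there is nothing to compare and selection fails. That selection logic can be sketched as:

```python
def best_checkpoint(eval_losses):
    """eval_losses maps checkpoint name -> eval loss (or None if the
    evaluation step produced no loss).

    Return the checkpoint with the lowest eval loss; raise if no
    evaluation recorded a loss, mirroring the failure described above.
    """
    scored = {name: loss for name, loss in eval_losses.items() if loss is not None}
    if not scored:
        raise KeyError("no eval loss was recorded; cannot pick a best model")
    return min(scored, key=scored.get)

print(best_checkpoint({"epoch-1": 0.42, "epoch-2": 0.31, "epoch-3": 0.35}))  # epoch-2
```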