
Retrieval and Retrieval-augmented LLMs

Results 622 FlagEmbedding issues

The command line is as follows:

```sh
#!/bin/sh
CUDA_VISIBLE_DEVICES=3 \
torchrun --nproc_per_node 1 \
  -m FlagEmbedding.reranker.run \
  --output_dir /home/dayita/model/rerank/train3 \
  --model_name_or_path /llms/models/bge-reranker-large/ \
  --train_data /home/dayita/muyu/testForRerank2.jsonl \
  --learning_rate 1e-5 \
  --gradient_checkpointing \
  --fp16 \
  --num_train_epochs 5 \
  ...
```
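For context on the `--train_data` file above: the reranker trainer commonly expects JSON Lines input with a query, a list of positive passages, and a list of hard-negative passages per line. A minimal sketch (field names follow the project's documented format; verify against your FlagEmbedding version before training):

```python
import json

# One training example per line: a query, its positive passages, and
# its hard-negative passages. The concrete strings here are invented
# placeholders, not real training data.
example = {
    "query": "how to fine-tune a reranker",
    "pos": ["A guide to fine-tuning cross-encoder rerankers ..."],
    "neg": ["An unrelated passage about image classification ..."],
}

line = json.dumps(example, ensure_ascii=False)  # serialize one JSONL line
parsed = json.loads(line)                       # round-trip to validate
print(sorted(parsed.keys()))  # → ['neg', 'pos', 'query']
```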

Hi - first of all, thanks for this great code base; it's really helpful. I've been trying to use these scripts for fine-tuning models other than BGE (e5-multilingual, I need...

Hello, a question about training the bge-embedding model: I fine-tuned on the task of using passages to retrieve relevant queries, but when I then use the fine-tuned model to retrieve passages with a query, the results are poor. In principle, though, bge is just an embedding model, so the two towers should be symmetric. What could be the reason?
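One reason the two directions may behave differently in practice: BGE's usage guidance recommends prepending an instruction to the query side only (for retrieval tasks), so even with shared weights the towers are not used symmetrically. A minimal sketch, where `encode` is a stand-in for a real embedding model and the English instruction string is the one documented for the English BGE models:

```python
# Queries get an instruction prefix; passages are embedded as-is.
QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "

def encode(text):
    # Placeholder encoder: a real model would return a dense vector.
    # Using text length keeps the sketch self-contained and runnable.
    return [float(len(text))]

def embed_query(q):
    return encode(QUERY_INSTRUCTION + q)  # prefix on the query side only

def embed_passage(p):
    return encode(p)                      # no prefix on the passage side
```

If training swapped the roles (passage-to-query), inference with query-to-passage sees inputs distributed differently from training, which can degrade results.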

I'd like to build a new vocab - is there a recommended way to do this?

Hello, when training the reranker, do in-batch negatives and cross-device negatives need to be considered?
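For background on why in-batch negatives matter less for a cross-encoder: each score requires a full forward pass over the (query, passage) pair, so negatives are usually limited to each query's own hard negatives, scored as one group with a listwise cross-entropy loss. A minimal numerically-stable sketch of that loss (an assumption about the typical formulation, not a transcription of FlagEmbedding's code):

```python
import math

def reranker_loss(scores, pos_index=0):
    """Cross-entropy over one group of scores: the positive passage
    (at pos_index) followed by that query's hard negatives."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[pos_index]  # -log softmax(positive)

# Lower loss when the positive clearly outscores the negatives.
loss = reranker_loss([2.0, 0.1, -1.3])
```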

Training the reranker with deepspeed enabled produces the following error; without deepspeed the error does not occur. ![image](https://github.com/FlagOpen/FlagEmbedding/assets/144193886/e413d3ff-1458-41ae-842f-d125433794e7)

Hello, my task is query-to-query matching: at inference time one query is a real online question and the other is a similar question from the knowledge base. I noticed that swapping the input order of the two queries yields slightly different scores, for example ![image](https://github.com/FlagOpen/FlagEmbedding/assets/33617887/ef5038e3-b212-447c-b5ba-8e17cff52821) gives: ![image](https://github.com/FlagOpen/FlagEmbedding/assets/33617887/5274aa26-6b0d-4d83-90f5-ea48618be642) For my task, is there any prior experience on which kind of query should be placed first?
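One simple workaround for the order sensitivity described above is to score the pair in both directions and average, which makes the result order-invariant by construction. A sketch, where `score` is a stand-in for a real cross-encoder's scoring call:

```python
def score(a, b):
    # Placeholder for a real reranker's (query, passage) relevance score;
    # deliberately asymmetric, like a cross-encoder with ordered inputs.
    return 0.1 * len(a) - 0.05 * len(b)

def symmetric_score(q1, q2):
    # Average the two orderings so swapping q1 and q2 changes nothing.
    return 0.5 * (score(q1, q2) + score(q2, q1))
```

This doubles inference cost, so for a fixed pipeline it may be cheaper to just pick one convention (e.g. online question first) and use it consistently at both training and inference time.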