
Retrieval and Retrieval-augmented LLMs

Results: 622 FlagEmbedding issues, sorted by recently updated

Hi - I'm trying to download the en and zh data from [this page](https://data.baai.ac.cn/details/BAAI-MTP). However, it keeps asking me to scan a WeChat QR code to log in, and even then it doesn't work. Is there...

Here are 3 texts: ![Text 1](https://github.com/user-attachments/assets/db7304ac-9d91-48b1-8cf2-207fb6c9a7df) ![Text 2](https://github.com/user-attachments/assets/3c60a5bc-10b8-42c2-87bd-517b6dca2c09) ![Text 3](https://github.com/user-attachments/assets/19d00fed-70d2-4451-83be-3dd3aebe91b8) Loading with FlagModel, comparison of the vectors of texts 1 and 2: ![Loaded with FlagModel](https://github.com/user-attachments/assets/4bb58824-01d0-4348-9e3f-c601251abda0) ![Vector comparison, texts 1 and 2](https://github.com/user-attachments/assets/03d026b3-0d3a-4d3b-af08-d04c3b6e5a3c) Loading with SentenceTransformer, comparison of the vectors of text...

I used FlagEmbedding/baai_general_embedding/finetune/eval_msmarco.py and built my own corpus and queries following the required file format, but the evaluation returns {'MRR@1': 0.0, 'MRR@10': 0.0, 'MRR@100': 0.0, 'Recall@1': 0.0, 'Recall@10': 0.0, 'Recall@100': 0.0}. Part of the fine-tuning data looks like this: {"query": "补骨脂 对人体的哪些脏腑有作用,具体作用是什么?", "pos": ["入脾命门心包三经。为壮火益土之品。(补相火以通君火)"], "neg": ["主温中。心腹痛。呕吐。去口臭气。(别录) 下气。止霍乱。一切冷气。消 酒 毒。 吐酸。", "入肝肾二经。为冲和之品。(兼补剂 能引肺金之气入肾)", "回春曰。河水与井水合用。亦名阴阳水。 以上宣剂水部",...

In FlagEmbedding, hard negatives are mined by rank position (FlagEmbedding/baai_general_embedding/finetune/hn_mine.py). Is there code that mines hard negatives based on similarity score instead?
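Score-band mining is not hard to sketch on top of precomputed embeddings. The helper below is a hypothetical illustration, not FlagEmbedding's code: instead of taking negatives from a rank window (as `hn_mine.py` does), it keeps candidates whose cosine similarity to the query falls inside a `[low, high]` band, which filters out both trivially easy negatives and likely false negatives. The function name and thresholds are assumptions for the example.

```python
import numpy as np

def mine_hard_negatives(query_emb, corpus_embs, pos_ids, low=0.4, high=0.95, k=5):
    """Pick negatives whose cosine similarity to the query falls inside a
    score band, rather than by rank position (illustrative sketch only).

    query_emb:   (d,) L2-normalized query embedding
    corpus_embs: (n, d) L2-normalized passage embeddings
    pos_ids:     set of indices of known positives to exclude
    """
    # Cosine similarity reduces to a dot product for normalized vectors.
    scores = corpus_embs @ query_emb
    candidates = [
        (i, s) for i, s in enumerate(scores)
        if i not in pos_ids and low <= s <= high
    ]
    # Hardest (highest-scoring) surviving negatives first.
    candidates.sort(key=lambda t: t[1], reverse=True)
    return [i for i, _ in candidates[:k]]
```

The `high` cutoff matters in practice: passages scoring almost as high as the positive are often unlabeled positives, and training on them as negatives hurts the model.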

Dear Authors, Firstly, thank you for your insightful paper, "Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense Retrieval." I found it highly informative and am excited about its potential...

Is it possible to compute the ColBERT score for the m3 model for more than one pair at a time? The current approach seems very inefficient.
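For context, the ColBERT-style score bge-m3 reports is a MaxSim aggregation over token embeddings, which can be batched once the token vectors are in hand. The sketch below assumes the token embeddings have already been produced (e.g. by `BGEM3FlagModel` with `return_colbert_vecs=True`) and are L2-normalized; padding and masking from the real pipeline are omitted.

```python
import numpy as np

def colbert_scores(query_tokens, doc_tokens_list):
    """Score one query against many documents with MaxSim (sketch).

    query_tokens:    (Lq, d) normalized query token embeddings
    doc_tokens_list: list of (Ld_i, d) normalized doc token embeddings
    Returns one score per document: for each query token, take the max
    similarity over the doc's tokens, then average over query tokens.
    """
    scores = []
    for doc in doc_tokens_list:
        sim = query_tokens @ doc.T             # (Lq, Ld) token-level similarities
        scores.append(sim.max(axis=1).mean())  # MaxSim per query token, then mean
    return np.array(scores)
```

If the documents are padded to a common length, the inner loop collapses into a single `(B, Lq, Ld)` einsum with a mask, which is where most of the batching speedup comes from.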

I have fine-tuned the reranker using this repo. I would like to continue training from the saved checkpoint. Can you give me some instructions on how to do that?

Recently I completed a RAG system project, and I want to use the three retrieval methods in bge-m3. However, when loading the model with BGEM3FlagModel(), errors will...
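Once the three signals (dense, sparse/lexical, and ColBERT multi-vector) are computed, bge-m3 combines them as a weighted sum. The helper below is an illustrative fusion sketch; the weights shown are example values, not necessarily the library's defaults.

```python
import numpy as np

def hybrid_score(dense, sparse, colbert, weights=(0.4, 0.2, 0.4)):
    """Weighted fusion of bge-m3's three relevance signals (sketch).

    dense, sparse, colbert: arrays of per-pair scores, same length.
    weights: illustrative mixing weights, normalized to sum to 1.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # convex combination keeps the fused score in scale
    s = np.stack([np.asarray(dense), np.asarray(sparse), np.asarray(colbert)])
    return (w[:, None] * s).sum(axis=0)
```

Tuning the three weights on a small validation set is usually worthwhile, since the best mix depends on how lexical the queries are.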

![image](https://github.com/user-attachments/assets/db78465d-7f49-4ef6-b830-189b3f06283c) Parameters: --learning_rate 3e-5 \ --fp16 \ --num_train_epochs 2 \ --per_device_train_batch_size 4 \ --dataloader_drop_last True \ --normlized False \ --temperature 0.02 \ --query_max_len 512 \ --passage_max_len 512 \ --train_group_size 6...

I see that max_len is set to 512 in the training parameters. I now want to fine-tune bge_reranker_large to handle a length of 2k. Is it enough to simply change max_len to 2k? After training, max_position_embeddings in the model config is still 512+2. If I want to fine-tune to 2k, which other parameter do I need to change?
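Changing the tokenizer's max_len alone does not grow the model's position-embedding table, which is why max_position_embeddings stays at 514 (512 plus RoBERTa-style padding offsets). One common workaround is to resize that table and interpolate the old rows into the new range before continuing fine-tuning. The function below is a generic numpy sketch of that interpolation, not FlagEmbedding's procedure; the resized weights would still need to be written back into the checkpoint and trained further.

```python
import numpy as np

def extend_position_embeddings(pe, new_len):
    """Linearly interpolate a (old_len, d) position-embedding matrix to
    (new_len, d). Sketch only: RoBERTa-style models also reserve special
    padding positions, and the stretched embeddings need further training
    before they are useful at the longer length.
    """
    old_len, d = pe.shape
    old_x = np.linspace(0.0, 1.0, old_len)   # old positions mapped to [0, 1]
    new_x = np.linspace(0.0, 1.0, new_len)   # new positions on the same axis
    out = np.empty((new_len, d))
    for j in range(d):
        out[:, j] = np.interp(new_x, old_x, pe[:, j])
    return out
```

After replacing the embedding weights, max_position_embeddings in the config must be updated to match the new table size, or loading the checkpoint will fail with a shape mismatch.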