
Retrieval and Retrieval-augmented LLMs

Results 622 FlagEmbedding issues

The command line is as follows:

```sh
#!/bin/sh
CUDA_VISIBLE_DEVICES=3 \
torchrun --nproc_per_node 1 \
  -m FlagEmbedding.reranker.run \
  --output_dir /home/dayita/model/rerank/train3 \
  --model_name_or_path /llms/models/bge-reranker-large/ \
  --train_data /home/dayita/muyu/testForRerank2.jsonl \
  --learning_rate 1e-5 \
  --gradient_checkpointing \
  --fp16 \
  --num_train_epochs 5 \
  ...
```
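For context on the `--train_data` file above: the reranker trainer commonly expects JSON Lines input with a query, a list of positive passages, and a list of hard-negative passages per line. A minimal sketch (field names follow the project's documented format; verify against your FlagEmbedding version before training):

```python
import json

# One training example per line: a query, its positive passages, and
# its hard-negative passages. The concrete strings here are invented
# placeholders, not real training data.
example = {
    "query": "how to fine-tune a reranker",
    "pos": ["A guide to fine-tuning cross-encoder rerankers ..."],
    "neg": ["An unrelated passage about image classification ..."],
}

line = json.dumps(example, ensure_ascii=False)  # serialize one JSONL line
parsed = json.loads(line)                       # round-trip to validate
print(sorted(parsed.keys()))  # → ['neg', 'pos', 'query']
```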

Hi - first of all, thanks for this great code base; it's really helpful. I've been trying to use these scripts for fine-tuning models other than BGE (e5-multilingual, I need...

Hello, a question about training the bge-embedding model: I fine-tuned on the task of using passages to retrieve relevant queries, but when I then use the fine-tuned model to retrieve passages with a query, the results are poor. In principle, though, bge is just an embedding model, so the two towers should be symmetric. What could be the reason?
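One reason the two directions may behave differently in practice: BGE's usage guidance recommends prepending an instruction to the query side only (for retrieval tasks), so even with shared weights the towers are not used symmetrically. A minimal sketch, where `encode` is a stand-in for a real embedding model and the English instruction string is the one documented for the English BGE models:

```python
# Queries get an instruction prefix; passages are embedded as-is.
QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "

def encode(text):
    # Placeholder encoder: a real model would return a dense vector.
    # Using text length keeps the sketch self-contained and runnable.
    return [float(len(text))]

def embed_query(q):
    return encode(QUERY_INSTRUCTION + q)  # prefix on the query side only

def embed_passage(p):
    return encode(p)                      # no prefix on the passage side
```

If training swapped the roles (passage-to-query), inference with query-to-passage sees inputs distributed differently from training, which can degrade results.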

I'd like to build a new vocab - is there a recommended way to do this?

Hello, when training the reranker, do in-batch negatives and cross-device negatives need to be considered?
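For background on why in-batch negatives matter less for a cross-encoder: each score requires a full forward pass over the (query, passage) pair, so negatives are usually limited to each query's own hard negatives, scored as one group with a listwise cross-entropy loss. A minimal numerically-stable sketch of that loss (an assumption about the typical formulation, not a transcription of FlagEmbedding's code):

```python
import math

def reranker_loss(scores, pos_index=0):
    """Cross-entropy over one group of scores: the positive passage
    (at pos_index) followed by that query's hard negatives."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[pos_index]  # -log softmax(positive)

# Lower loss when the positive clearly outscores the negatives.
loss = reranker_loss([2.0, 0.1, -1.3])
```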

Training the reranker with deepspeed enabled produces the following error; without deepspeed the error does not occur. ![image](https://github.com/FlagOpen/FlagEmbedding/assets/144193886/e413d3ff-1458-41ae-842f-d125433794e7)

Hello, my task is query-to-query matching: at inference time one query is a real online question and the other is a similar question from the knowledge base. I noticed that swapping the input order of the two queries yields slightly different scores, for example ![image](https://github.com/FlagOpen/FlagEmbedding/assets/33617887/ef5038e3-b212-447c-b5ba-8e17cff52821) gives: ![image](https://github.com/FlagOpen/FlagEmbedding/assets/33617887/5274aa26-6b0d-4d83-90f5-ea48618be642) For my task, is there any prior experience on which kind of query should be placed first?
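One simple workaround for the order sensitivity described above is to score the pair in both directions and average, which makes the result order-invariant by construction. A sketch, where `score` is a stand-in for a real cross-encoder's scoring call:

```python
def score(a, b):
    # Placeholder for a real reranker's (query, passage) relevance score;
    # deliberately asymmetric, like a cross-encoder with ordered inputs.
    return 0.1 * len(a) - 0.05 * len(b)

def symmetric_score(q1, q2):
    # Average the two orderings so swapping q1 and q2 changes nothing.
    return 0.5 * (score(q1, q2) + score(q2, q1))
```

This doubles inference cost, so for a fixed pipeline it may be cheaper to just pick one convention (e.g. online question first) and use it consistently at both training and inference time.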