FlagEmbedding

bge continue pretrain?

Open · CarllllWang opened this issue 1 year ago · 1 comment

Hi, I would like to ask about adding training objectives that benefit downstream tasks to the pre-training of BGE, on top of the MLM task.

Specifically, my downstream task is query-passage relevance matching. I would like to include this objective in the pre-training phase as well, as a form of continued pre-training.

Is this necessary, and if so, how can it be done?

CarllllWang, Jul 19 '24 16:07

We recommend fine-tuning BGE directly on the downstream task. If you want to do continued pre-training, you can use this script: https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/pretrain.

staoxiao, Jul 21 '24 16:07
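
For readers wondering what a query-passage relevance objective might look like in practice, here is a minimal sketch of a contrastive (InfoNCE-style) loss with in-batch negatives, which is a common choice for fine-tuning embedding models on relevance matching. It is not taken from the FlagEmbedding code: the function name `in_batch_contrastive_loss`, the temperature of 0.05, and the toy 2-dimensional embeddings are all assumptions made for illustration, and it uses plain Python rather than a training framework.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so that dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def in_batch_contrastive_loss(query_embs, passage_embs, temperature=0.05):
    """Mean InfoNCE loss over a batch (hypothetical helper, not a FlagEmbedding API).

    passage_embs[i] is treated as the positive passage for query_embs[i];
    every other passage in the batch serves as a negative.
    The temperature value is an assumed default, not a recommended setting.
    """
    queries = [normalize(q) for q in query_embs]
    passages = [normalize(p) for p in passage_embs]
    total = 0.0
    for i, q in enumerate(queries):
        # Similarity of this query to every passage in the batch.
        logits = [dot(q, p) / temperature for p in passages]
        # Numerically stable log-sum-exp.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        # Cross-entropy with the matching passage as the target class.
        total += log_sum_exp - logits[i]
    return total / len(queries)


# Toy embeddings: each query points in the same direction as its own passage.
aligned = in_batch_contrastive_loss(
    [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]
)
# Same vectors, but the passages are swapped so each query matches the wrong one.
swapped = in_batch_contrastive_loss(
    [[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]
)
print(aligned < swapped)
```

The loss is small when each query is most similar to its own passage and large when it is more similar to another passage in the batch. In a real setup the embeddings would come from the model and the loss would be minimized with a gradient-based optimizer.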