
About fine-tuning

Open. DarrenZhu1103 opened this issue 1 year ago • 1 comment

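Hello, I'd like to use bge-m3 for a video retrieval scenario. To construct hard negatives, I use the dense embeddings of the original m3 model to retrieve the titles most similar to each query, have GPT-4 label them, and keep the samples labeled as irrelevant. So far I have built 80k [query, pos, neg] samples that I want to use for fine-tuning. Are there recommended parameter settings for fine-tuning, such as batch_size, learning rate, and temperature? The original paper doesn't seem to mention them.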

DarrenZhu1103, Feb 27 '24 14:02
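
The hard-negative construction described in the question can be sketched roughly as follows. This is a toy illustration, not the poster's actual pipeline: the 2-dimensional vectors stand in for bge-m3 dense embeddings, the `is_relevant` callable is a hypothetical stand-in for the GPT-4 labeling step, and the helper name `mine_hard_negatives` is made up for this example. The output uses one JSON object per line with `query`, `pos`, and `neg` keys, which is assumed here to match the layout described in the FlagEmbedding fine-tuning example; check the linked README before relying on it.

```python
import json
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def mine_hard_negatives(query_vec, candidates, is_relevant, top_k=3):
    """Return the top_k most similar titles that the labeler marks irrelevant.

    candidates:  list of (title, vector) pairs (vectors are toy embeddings here).
    is_relevant: hypothetical callable title -> bool, standing in for GPT-4 labels.
    """
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [title for title, _ in ranked[:top_k] if not is_relevant(title)]


# Toy data: in practice the vectors would come from the dense encoder.
query = "how to cook pasta"
query_vec = [1.0, 0.0]
candidates = [
    ("Cooking pasta at home", [0.9, 0.1]),
    ("Pasta factory tour", [0.8, 0.3]),
    ("History of pasta", [0.7, 0.5]),
    ("Car repair basics", [0.0, 1.0]),
]
labels = {"Cooking pasta at home": True}  # everything else counts as irrelevant

negs = mine_hard_negatives(
    query_vec, candidates, lambda title: labels.get(title, False), top_k=3
)

# One training record per line (JSON Lines).
record = {"query": query, "pos": ["Cooking pasta at home"], "neg": negs}
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Only candidates that rank high by similarity but are judged irrelevant end up in `neg`, which is what makes them "hard": the low-similarity title is never considered because it falls outside the top-k cut.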

Just refer to the fine-tuning parameters for bge v1.5: https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune#3-train Make batch_size as large as you can, and use learning rate = 1e-5 or 5e-6 and temperature = 0.02.

staoxiao, Feb 28 '24 02:02
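
To make the role of the temperature setting concrete, here is a minimal sketch of a temperature-scaled contrastive (InfoNCE-style) loss for one query, assuming similarity scores are cosine similarities in [-1, 1]. It is a generic illustration of the technique, not FlagEmbedding's actual training code; the function name `info_nce_loss` and the example scores are made up.

```python
import math


def info_nce_loss(pos_score, neg_scores, temperature=0.02):
    """Cross-entropy of the positive against the negatives after temperature scaling.

    pos_score:  similarity between the query and its positive passage.
    neg_scores: similarities between the query and each negative passage.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# Same scores, two temperatures: a small temperature magnifies small score gaps.
loss_sharp = info_nce_loss(0.8, [0.7], temperature=0.02)
loss_flat = info_nce_loss(0.8, [0.7], temperature=1.0)
print(loss_sharp, loss_flat)
```

Dividing by 0.02 stretches a 0.1 gap in cosine similarity into a 5-point gap in logits, so the loss reacts strongly to hard negatives that score close to the positive. In-batch negatives add more entries to `neg_scores`, which is one reason a larger batch size tends to help this kind of objective.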