Shitao Xiao
I think this loss curve is normal. You need to smooth the curve further to observe its trend. Besides, you can set `--report_to tensorboard` to save the loss by...
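If you want to smooth the logged loss yourself (for example, values exported from TensorBoard), a simple exponential moving average is enough to see the trend. This snippet is a generic illustration, not part of FlagEmbedding:

```python
def smooth(values, weight=0.9):
    """Exponential moving average, similar in spirit to TensorBoard's smoothing slider."""
    smoothed = []
    last = values[0]
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed

# Example: noisy per-step losses -> smoothed trend line.
raw_losses = [2.1, 1.8, 2.3, 1.7, 1.9, 1.5, 1.6, 1.4]
print(smooth(raw_losses))
```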
@ZTurboX, you can refer to https://github.com/FlagOpen/FlagEmbedding/issues/789
@LLLiHaotian, you need to fine-tune the model on your downstream data and select the best pretraining checkpoint based on the downstream performance.
There is no appropriate metric to evaluate the performance of the pre-training task. We recommend selecting the checkpoint based on its fine-tuned performance on the downstream task.
> I think I roughly get it now. With `output_1 = model.encode(sentences_1, return_dense=True, return_sparse=True, return_colbert_vecs=False)`, `output_1['lexical_weights']` is the data I need, and `model.convert_id_to_token(output_1['lexical_weights'])` gives the id-to-token format like above, right?

Yes
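For context, a minimal sketch of the sparse-output path being discussed, assuming the `BGEM3FlagModel` interface from the FlagEmbedding README (the sentence list here is illustrative):

```python
from FlagEmbedding import BGEM3FlagModel

# Load BGE-M3; use_fp16 speeds up inference with a small accuracy cost.
model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)

sentences_1 = ["What is BGE M3?", "Definition of BM25"]  # illustrative inputs

# Request dense and sparse (lexical) outputs, skip the ColBERT vectors.
output_1 = model.encode(
    sentences_1,
    return_dense=True,
    return_sparse=True,
    return_colbert_vecs=False,
)

# 'lexical_weights' is a list of {token_id: weight} dicts, one per sentence.
print(output_1['lexical_weights'])

# convert_id_to_token maps those ids back to readable tokens.
print(model.convert_id_to_token(output_1['lexical_weights']))
```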
You can pretrain M3 following this example: https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/pretrain
Yes, M3 and other models share the same pretraining script.
This error is strange; we don't use `save_pretrained` for BiEncoderModel. You can try updating the FlagEmbedding and transformers packages.
You need to enable `sub_batch_size` (https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/BGE_M3/modeling.py#L156). With longer sequence lengths, it can noticeably increase the batch size you can fit.
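For intuition, here is a conceptual sketch of sub-batched encoding, not the code at the linked line: the full batch of long sequences is forwarded in chunks of `sub_batch_size` and the embeddings are concatenated afterwards. The `encoder` callable and feature keys below are illustrative assumptions.

```python
import torch

def encode_with_sub_batches(encoder, features: dict, sub_batch_size: int) -> torch.Tensor:
    """Encode a large batch in chunks of `sub_batch_size` and concatenate the results.

    `encoder` and the exact feature keys are illustrative; the real logic lives in
    FlagEmbedding/BGE_M3/modeling.py around the linked line.
    """
    input_ids = features["input_ids"]
    attention_mask = features["attention_mask"]
    all_embeddings = []
    for start in range(0, input_ids.size(0), sub_batch_size):
        end = start + sub_batch_size
        chunk = {
            "input_ids": input_ids[start:end],
            "attention_mask": attention_mask[start:end],
        }
        # Forward each chunk separately instead of the whole batch at once.
        all_embeddings.append(encoder(**chunk))
    return torch.cat(all_embeddings, dim=0)
```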