Shitao Xiao
I think this loss curve is normal. You need to smooth the curve further to observe its trend. Besides, you can set `--report_to tensorboard` to save the loss by...
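If you want to smooth the logged loss yourself (for example, values exported from TensorBoard), a simple exponential moving average is enough to see the trend. This snippet is a generic illustration, not part of FlagEmbedding:

```python
def smooth(values, weight=0.9):
    """Exponential moving average, similar in spirit to TensorBoard's smoothing slider."""
    smoothed = []
    last = values[0]
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed

# Example: noisy per-step losses -> smoothed trend line.
raw_losses = [2.1, 1.8, 2.3, 1.7, 1.9, 1.5, 1.6, 1.4]
print(smooth(raw_losses))
```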
@ZTurboX, you can refer to https://github.com/FlagOpen/FlagEmbedding/issues/789
@LLLiHaotian, you need to fine-tune the model on your downstream data and select the best pretraining checkpoint based on the downstream performance.
There is no appropriate metric to evaluate the performance of the pre-training task. We recommend selecting the checkpoint based on its fine-tuned performance on the downstream task.
> I think I roughly get it now. With `output_1 = model.encode(sentences_1, return_dense=True, return_sparse=True, return_colbert_vecs=False)`, `output_1['lexical_weights']` is the data I need, and `model.convert_id_to_token(output_1['lexical_weights'])` gives the id-to-token format like above, right?

Yes
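For context, a minimal sketch of the sparse-output path being discussed, assuming the `BGEM3FlagModel` interface from the FlagEmbedding README (the sentence list here is illustrative):

```python
from FlagEmbedding import BGEM3FlagModel

# Load BGE-M3; use_fp16 speeds up inference with a small accuracy cost.
model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)

sentences_1 = ["What is BGE M3?", "Definition of BM25"]  # illustrative inputs

# Request dense and sparse (lexical) outputs, skip the ColBERT vectors.
output_1 = model.encode(
    sentences_1,
    return_dense=True,
    return_sparse=True,
    return_colbert_vecs=False,
)

# 'lexical_weights' is a list of {token_id: weight} dicts, one per sentence.
print(output_1['lexical_weights'])

# convert_id_to_token maps those ids back to readable tokens.
print(model.convert_id_to_token(output_1['lexical_weights']))
```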
You can pretrain M3 following this example: https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/pretrain
Yes, M3 and other models share the same pretraining script.
This error is strange; we don't use `save_pretrained` for BiEncoderModel. You can try updating the FlagEmbedding and transformers packages.
You need to enable `sub_batch_size` (https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/BGE_M3/modeling.py#L156). With longer sequence lengths, it can noticeably increase the batch size you can fit.
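For intuition, here is a conceptual sketch of sub-batched encoding, not the code at the linked line: the full batch of long sequences is forwarded in chunks of `sub_batch_size` and the embeddings are concatenated afterwards. The `encoder` callable and feature keys below are illustrative assumptions.

```python
import torch

def encode_with_sub_batches(encoder, features: dict, sub_batch_size: int) -> torch.Tensor:
    """Encode a large batch in chunks of `sub_batch_size` and concatenate the results.

    `encoder` and the exact feature keys are illustrative; the real logic lives in
    FlagEmbedding/BGE_M3/modeling.py around the linked line.
    """
    input_ids = features["input_ids"]
    attention_mask = features["attention_mask"]
    all_embeddings = []
    for start in range(0, input_ids.size(0), sub_batch_size):
        end = start + sub_batch_size
        chunk = {
            "input_ids": input_ids[start:end],
            "attention_mask": attention_mask[start:end],
        }
        # Forward each chunk separately instead of the whole batch at once.
        all_embeddings.append(encoder(**chunk))
    return torch.cat(all_embeddings, dim=0)
```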