FlagEmbedding

Error when fine-tuning bge-reranker-large

Open libingbingd opened this issue 2 years ago • 1 comment

Hello, I keep getting two errors when fine-tuning bge-reranker-large. What is causing them? (Attached screenshots: 46347d54-5deb-4224-b6e2-5ccb43b06e0f, 556a9365-5f38-49be-ac90-8efa21ac88e5.) My GPU has 30 GB of memory, and training bge-large-zh on it works fine. The parameters are as follows:

python ./run.py --model_name_or_path='/oss/model/bge-reranker-large' \
--output_dir='/oss/model/bge-reranker-large-ft-lbb/1.0.0' \
--train_data='/oss/data/bge-large-zh-lbb-data/train_pos_top30.jsonl' \
--num_train_epochs=2 \
--learning_rate 6e-5 \
--fp16 \
--per_device_train_batch_size 10 \
--logging_steps 100

libingbingd, Dec 28 '23 08:12

You have run out of GPU memory. You can reduce train_group_size, or reduce per_device_train_batch_size (raising gradient_accumulation_steps lets you keep the same effective batch size).
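To make the trade-off concrete, here is a rough Python sketch of the arithmetic. It assumes that each query in a batch is expanded into train_group_size query-passage pairs per forward pass, and that train_group_size defaults to 8 when it is not passed on the command line; both are assumptions to verify against your version of run.py. The `TrainConfig` class and its method names are hypothetical helpers for illustration, not part of FlagEmbedding.

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    """Hypothetical container mirroring the relevant command-line flags."""
    per_device_train_batch_size: int
    gradient_accumulation_steps: int = 1
    train_group_size: int = 8  # assumed default; check your run.py arguments
    num_devices: int = 1

    def pairs_per_forward(self) -> int:
        # Query-passage pairs held in GPU memory at once (assumed to be
        # batch size times group size). This drives peak memory use.
        return self.per_device_train_batch_size * self.train_group_size

    def effective_batch_size(self) -> int:
        # Queries contributing to each optimizer update.
        return (
            self.per_device_train_batch_size
            * self.gradient_accumulation_steps
            * self.num_devices
        )


# Settings from the question: batch size 10, no gradient accumulation.
original = TrainConfig(per_device_train_batch_size=10)

# One possible adjustment: smaller per-device batch, more accumulation steps.
reduced = TrainConfig(per_device_train_batch_size=2, gradient_accumulation_steps=5)

for name, cfg in (("original", original), ("reduced", reduced)):
    print(
        f"{name}: pairs per forward pass = {cfg.pairs_per_forward()}, "
        f"effective batch size = {cfg.effective_batch_size()}"
    )
```

Under these assumptions, the reduced settings keep 10 queries per optimizer update while holding far fewer pairs in memory at once. The values 2 and 5 are only an example; pick whatever fits your GPU.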

staoxiao, Dec 28 '23 11:12