Training Time of Reranker
Thanks for the great work and the open-source models. I am quite interested in the following questions:
- The total time it took to train the LLM rerankers such as Gemma and MiniCPM, and on what hardware.
- The max query/passage length and the batch size used when training the LLM reranker.
Many thanks!
We trained for 4 days on 8 * 40G A100 GPUs. During training, the total length of query plus passage was 1024, and the batch size was 128.
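For reference, here is a minimal sketch of what that budget looks like with a Hugging Face tokenizer (this is not the authors' exact data pipeline, and the `BAAI/bge-reranker-v2-gemma` checkpoint is used purely as an illustration): the query and passage are tokenized as a pair, jointly truncated to 1024 tokens, and 128 such pairs form one batch.

```python
# Hedged sketch of the reported budget: query + passage truncated to a
# 1024-token total, with an effective batch size of 128.
from transformers import AutoTokenizer

MAX_TOTAL_LEN = 1024   # total length of query plus passage
BATCH_SIZE = 128       # effective batch size across the 8 * 40G A100 GPUs

# Illustrative checkpoint only; substitute the base model you are training.
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-reranker-v2-gemma")

def encode_pairs(queries, passages):
    # Tokenizing query/passage as a pair lets the tokenizer truncate the
    # combined sequence to the 1024-token budget.
    return tokenizer(
        queries,
        passages,
        truncation=True,
        max_length=MAX_TOTAL_LEN,
        padding=True,
        return_tensors="pt",
    )

batch = encode_pairs(
    ["what is a panda?"] * BATCH_SIZE,
    ["The giant panda is a bear species endemic to China."] * BATCH_SIZE,
)
print(batch["input_ids"].shape)  # (128, <=1024)
```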
Thank you for your quick follow-up. Sorry, I have another question: how many epochs did you train on all the m3 + fever + quora data? Did you do any downsampling?
Training for 1-2 epochs is enough.
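For anyone reproducing this, a hedged sketch of how the epoch count might map onto standard Hugging Face `TrainingArguments` (the actual FlagEmbedding fine-tuning script may expose these options differently):

```python
# Assumed mapping onto TrainingArguments; values other than the epoch count
# and the 8-GPU batch arithmetic are illustrative, not from the thread.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./reranker_output",   # hypothetical path
    num_train_epochs=2,               # 1-2 epochs over the mixed data is enough
    per_device_train_batch_size=16,   # 16 x 8 GPUs = 128 effective batch size
    bf16=True,
)
```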
Thanks for the reply. Sorry, I have a few more questions:
- For long-context examples (e.g. length > 1k), do we decrease the batch size during training? If so, is this done automatically?
- During training, is left padding or right padding used (i.e., what is the tokenizer's padding_side)?
- It will truncate long contexts, so there is no need to decrease the batch size.
- We follow the tokenizer's default padding side (see the sketch below).
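A minimal sketch of both points, assuming a Hugging Face tokenizer (the checkpoint name is again only illustrative): over-long inputs are simply truncated to the 1024-token budget instead of triggering a smaller batch, and padding uses whatever `padding_side` the tokenizer ships with.

```python
# Hedged illustration of truncation and padding behaviour during training.
from transformers import AutoTokenizer

# Illustrative checkpoint; substitute the reranker base model you are training.
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-reranker-v2-gemma")
print(tokenizer.padding_side)  # keep the default, e.g. "left" or "right"

long_passage = "word " * 5000  # well over 1k tokens
batch = tokenizer(
    ["short query"],
    [long_passage],
    truncation=True,       # long contexts are simply cut to the budget
    max_length=1024,
    padding="max_length",  # padding follows tokenizer.padding_side
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # (1, 1024)
```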