
Training Time of Reranker

Open Impavidity opened this issue 1 year ago • 5 comments

Thanks for the great work and the open-source models. I am quite interested in the following questions:

  1. The total time it took to train the LLM rerankers, such as Gemma and MiniCPM, and the hardware used.
  2. The max length for the query/passage and the batch size when training the LLM reranker.

Many thanks!

Impavidity avatar May 14 '24 04:05 Impavidity

We trained for 4 days on 8 * 40G A100 GPUs. During training, the total length of query plus passage was 1024, and the batch size was 128.

545999961 avatar May 14 '24 05:05 545999961
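
For readers who want to reproduce a comparable setup, here is a minimal sketch of how the settings reported above (8 GPUs, a global batch size of 128, and a 1024-token query + passage budget) could be expressed with Hugging Face `TrainingArguments`. The per-device/accumulation split, the learning rate, and the argument names are assumptions for illustration; FlagEmbedding's own finetuning script may use different flags.

```python
# Illustrative sketch only -- not the exact FlagEmbedding training command.
# Shows one way the reported settings could map onto Hugging Face Trainer arguments.
from transformers import TrainingArguments

MAX_TOTAL_LENGTH = 1024  # query + passage token budget (passed to the tokenizer/collator)
NUM_GPUS = 8             # 8 x 40G A100, as reported above

args = TrainingArguments(
    output_dir="llm-reranker-finetune",
    per_device_train_batch_size=4,   # assumption: 4 * 8 GPUs * 4 accumulation steps = 128 global
    gradient_accumulation_steps=4,
    num_train_epochs=2,              # 1-2 epochs (see the reply below)
    learning_rate=1e-5,              # assumption: not stated in this thread
    logging_steps=50,
)
```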

Thank you for your quick follow-up. Sorry, I have another question: how many epochs did you train on all of the m3 + fever + quora data? Did you do any downsampling?

Impavidity avatar May 14 '24 08:05 Impavidity

Training for 1-2 epochs is enough.

545999961 avatar May 14 '24 11:05 545999961

Thanks for the reply. Sorry, I have a few more questions.

  1. For long-context examples (e.g., length > 1k), do we need to decrease the batch size during training? If so, is this done automatically?
  2. During training, is left padding or right padding used (i.e., what is the tokenizer's padding_side)?

Impavidity avatar May 17 '24 15:05 Impavidity

  1. Long contexts are truncated, so there is no need to decrease the batch size.
  2. We follow the tokenizer's default padding side.

545999961 avatar May 20 '24 02:05 545999961
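
To make the two answers above concrete, here is a minimal sketch using the Hugging Face tokenizer API: query + passage pairs are truncated to the 1024-token budget, and the tokenizer's default `padding_side` is left untouched. The model name is only an example.

```python
# Minimal sketch of the two points above: truncate long query+passage pairs to the
# total budget, and keep whatever padding_side the tokenizer ships with.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-reranker-v2-gemma")  # example model name

print(tokenizer.padding_side)  # use the tokenizer's default; no override needed

query = "what is dense retrieval?"
passage = "Dense retrieval encodes queries and documents into vectors ... " * 200  # deliberately long

batch = tokenizer(
    [query],
    [passage],
    truncation=True,     # long query+passage pairs are truncated ...
    max_length=1024,     # ... to the 1024-token budget mentioned earlier in the thread
    padding=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # at most (1, 1024), so the batch size is unaffected
```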