
E5: How to fine-tune the E5 model on the NLI task?

Open MatanAvitan opened this issue 1 year ago • 3 comments

Model I am using: E5

I have several questions regarding fine-tuning the E5 model on the NLI task.

  1. Should I add passage: to the premise and query: to the hypothesis (as it's an asymmetric task), or the other way around? Or maybe just add query: as the second token (after <s>), regardless of the position of the premise/hypothesis?

Currently I'm fine-tuning with the following format: <s> passage: premise </s><s> query: hypothesis </s>

I'd be happy to know if this is the correct way to do it.
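For reference, here is roughly how I'm building those inputs right now (a rough sketch with made-up example sentences, assuming the intfloat/multilingual-e5-base tokenizer, not my exact training code):

```python
# Rough sketch: build the "<s> passage: premise </s><s> query: hypothesis </s>"
# sequence explicitly and tokenize it without adding special tokens a second time.
# Assumes the intfloat/multilingual-e5-base tokenizer and made-up sentences.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("intfloat/multilingual-e5-base")

premise = "A man is playing a guitar on stage."
hypothesis = "A person is playing an instrument."

text = (
    f"{tokenizer.cls_token} passage: {premise} {tokenizer.sep_token}"
    f"{tokenizer.cls_token} query: {hypothesis} {tokenizer.sep_token}"
)
encoded = tokenizer(text, add_special_tokens=False)
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
```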

  2. Are the training scripts of E5 publicly available? I couldn't find them.

  3. When fine-tuning on different tasks, did you just stack a proper head on top of the current pooler? This is the pooler I'm referring to:

(pooler): XLMRobertaPooler(
  (dense): Linear(in_features=768, out_features=768, bias=True)
  (activation): Tanh()
)

If so, where can I find the weights of the different heads for the different fine-tuned tasks? I guess they are not very important, but they may be helpful.
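For context, the printout above comes from simply loading a released checkpoint and printing its pooler (a minimal sketch, assuming intfloat/multilingual-e5-base):

```python
# Minimal sketch: load a released multilingual E5 checkpoint and print the
# pooler module referred to above.
from transformers import AutoModel

model = AutoModel.from_pretrained("intfloat/multilingual-e5-base")
print(model.pooler)
# XLMRobertaPooler(
#   (dense): Linear(in_features=768, out_features=768, bias=True)
#   (activation): Tanh()
# )
```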

Thanks in advance!

MatanAvitan avatar Dec 10 '23 14:12 MatanAvitan

@intfloat Hi :) can you assist please?

MatanAvitan avatar Dec 17 '23 09:12 MatanAvitan

Hi @MatanAvitan ,

Thanks for the questions.

  1. Should I add passage: to the premise and query: to the hypothesis?

Although NLI is technically an asymmetric task, we follow the SimCSE paper and treat it as a symmetric task. During training, we add passage: to the premise and query: to the hypothesis 50% of the time, and swap the prefixes the other 50% of the time.
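In other words, the prefix assignment for an NLI pair looks roughly like this (a simplified sketch, not the actual training code):

```python
# Simplified sketch of the 50/50 prefix assignment described above;
# not the actual E5 training code.
import random

def add_prefixes(premise: str, hypothesis: str) -> tuple[str, str]:
    # Randomly decide which side gets "query:" and which gets "passage:".
    if random.random() < 0.5:
        return "query: " + premise, "passage: " + hypothesis
    return "passage: " + premise, "query: " + hypothesis
```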

  2. Are the training scripts of E5 publicly available?

Unfortunately, the training scripts are not publicly available. Our code is based on https://github.com/microsoft/unilm/tree/master/simlm with some changes to support custom prefixes. The released E5 checkpoints are supposed to be good embedding models without any further training. If you would like to fine-tune them, you can use existing libraries such as Tevatron by changing the initialization.
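For example, the released checkpoints can be used directly as embedding models with average pooling over the token embeddings (a minimal usage sketch with made-up inputs, assuming intfloat/multilingual-e5-base):

```python
# Minimal usage sketch: embed a query and a passage with a released E5
# checkpoint using average pooling over non-padding tokens, then compare
# them with cosine similarity. The input strings are made-up examples.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_name = "intfloat/multilingual-e5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

texts = [
    "query: how much protein should a female eat",
    "passage: Protein requirements vary with age, sex, and activity level.",
]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    out = model(**batch)

# Average pooling over the last hidden states, ignoring padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()
emb = (out.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
emb = F.normalize(emb, p=2, dim=1)

print("cosine similarity:", (emb[0] @ emb[1]).item())
```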

  3. When fine-tuning on different tasks, did you just stack a proper head on top of the current pooler?

I am not sure I understand your question. We do not fine-tune the model on different tasks separately. Instead, we pre-train and fine-tune it on a mixture of data jointly, and then evaluate it on different tasks without any further fine-tuning.

intfloat avatar Dec 18 '23 02:12 intfloat

Thank you for your questions and contributions @MatanAvitan.

I would like to fine-tune the e5-multilingual-base or e5-multilingual-large embedding model. Could you share the code you used with me, or do you have any recommendations?

4entertainment avatar May 29 '24 08:05 4entertainment