
optimization and loss

Open lshowway opened this issue 3 years ago • 1 comments

Hi, thanks for your code and paper. I am new to EL and I have a question: are the bi-encoder and cross-encoder optimized jointly or separately? Specifically, what is the relationship between the loss function in Eq. 4, the loss function in the paragraph following Eq. 6, and the loss function in Eq. 10?

lshowway avatar Nov 30 '21 06:11 lshowway

@lshowway Hi, sorry for the delay in responding. The bi-encoder and cross-encoder are optimized separately, and the loss functions in those equations are all independent of each other. To be more specific:

1. Use Eq. 4 to train a bi-encoder.
2. Use Eq. 6 to train a cross-encoder.
3. Use Eq. 10 to train a distillation model (teacher: cross-encoder; student: bi-encoder).

I hope that answers your question!
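To make the separation concrete, here is a rough NumPy sketch of three independent objectives. It is not taken from the paper or the repository: the exact forms of Eq. 4, Eq. 6 and Eq. 10 are not reproduced in this thread, so the sketch assumes a softmax over candidate scores for the two encoder losses and a cross-entropy between teacher and student distributions for distillation. The names `candidate_nll` and `distillation_loss`, and the random vectors standing in for model outputs, are all hypothetical.

```python
import numpy as np

def log_softmax(scores):
    # Numerically stable log-softmax over the candidate axis.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def candidate_nll(scores, gold_index):
    # Assumed ranking loss: negative log-likelihood of the gold candidate.
    return -log_softmax(scores)[gold_index]

def distillation_loss(student_scores, teacher_scores):
    # Assumed distillation loss: cross-entropy of the student distribution
    # against the (fixed) teacher distribution over the same candidates.
    teacher_probs = np.exp(log_softmax(teacher_scores))
    return -(teacher_probs * log_softmax(student_scores)).sum()

rng = np.random.default_rng(0)
mention_vec = rng.normal(size=8)        # stand-in for a mention embedding
entity_vecs = rng.normal(size=(5, 8))   # stand-ins for 5 candidate entity embeddings
gold = 2                                # index of the correct candidate

# Stage 1: bi-encoder trained on its own, scoring by dot product.
bi_scores = entity_vecs @ mention_vec
bi_loss = candidate_nll(bi_scores, gold)

# Stage 2: cross-encoder trained on its own (random scores stand in for its output).
cross_scores = rng.normal(size=5)
cross_loss = candidate_nll(cross_scores, gold)

# Stage 3: distillation, with the cross-encoder as teacher and the bi-encoder as student.
kd_loss = distillation_loss(bi_scores, cross_scores)
```

Each loss is computed and minimized in its own training run; nothing in this sketch sums the three into a joint objective.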

ledw-2 avatar Feb 06 '22 07:02 ledw-2