BLINK
optimization and loss
Hi, thanks for your code and paper. I am new to entity linking (EL) and have a question: are the bi-encoder and cross-encoder optimized jointly or separately? Specifically, what is the relationship between the loss function in Eq. 4, the loss function in the paragraph following Eq. 6, and the loss function in Eq. 10?
@lshowway Hi, sorry for the delay in responding. The bi-encoder and cross-encoder are optimized separately, and the three loss functions are independent of each other. To be more specific: 1. Use Eq. 4 to train a bi-encoder. 2. Use Eq. 6 to train a cross-encoder. 3. Use Eq. 10 to train a distillation model (teacher: cross-encoder, student: bi-encoder). I hope that answers your question!
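For anyone else reading, here is a rough sketch of how the three independent objectives could look. It is not taken from the BLINK code. It assumes that the bi-encoder and cross-encoder losses are both a softmax cross-entropy over candidate scores, and that the distillation loss is a cross-entropy between temperature-softened teacher and student distributions. The function names (`softmax_ce`, `distill_loss`) and the `temperature` parameter are hypothetical and only serve the illustration.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def softmax_ce(scores, gold):
    """Cross-entropy of the gold candidate: -s_gold + log sum_j exp(s_j).

    Assumed form for both stage 1 (bi-encoder scores against in-batch
    entities) and stage 2 (cross-encoder scores against retrieved candidates).
    """
    return -log_softmax(scores)[gold]


def distill_loss(teacher_scores, student_scores, temperature=1.0):
    """Assumed distillation objective: cross-entropy between the softened
    teacher distribution and the softened student distribution."""
    teacher_logp = log_softmax([s / temperature for s in teacher_scores])
    teacher_p = [math.exp(lp) for lp in teacher_logp]
    student_logp = log_softmax([s / temperature for s in student_scores])
    return -sum(p * lq for p, lq in zip(teacher_p, student_logp))


# Stage 1: train the bi-encoder on its own scores (illustrative numbers).
bi_scores = [2.0, 0.5, -1.0]
stage1 = softmax_ce(bi_scores, gold=0)

# Stage 2: train the cross-encoder separately on its own scores.
cross_scores = [3.0, 0.0, -2.0]
stage2 = softmax_ce(cross_scores, gold=0)

# Stage 3: the trained cross-encoder acts as a fixed teacher and the
# bi-encoder is the student; only the student would receive gradients.
stage3 = distill_loss(cross_scores, bi_scores, temperature=2.0)
```

Each stage has its own loss and its own training run, so no gradient flows between the bi-encoder and cross-encoder in stages 1 and 2. In stage 3 the teacher scores are treated as constants.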