recommenders

[Question] How to optimize a two-tower model trained with InfoNCE

Open • unshaven opened this issue 1 year ago • 1 comment

I have tried a two-tower model (user and query) with contrastive learning in a real industrial scenario. The training samples are all actual click samples, and the loss function is InfoNCE. I have run into two issues:

  1. The model performs best with a single MLP layer; the more MLP layers I add, the worse HR@100 becomes.
  2. Applying L2 normalization at the end of the model degrades performance.

As a result, I currently use only one MLP layer and no normalization. Could you please offer some advice or share your experience on what I should try next?
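For reference, here is a minimal NumPy sketch of the kind of loss described above. It is not the actual training code: it assumes in-batch negatives (row i of the user batch is paired with row i of the query batch, and every other row in the batch serves as a negative), and the names `info_nce_loss`, `normalize`, and `temperature` are hypothetical. The `temperature` parameter in particular is an assumption that the setup above does not mention. It is included because, once embeddings are L2-normalized, every logit is a cosine similarity in [-1, 1], so with a temperature of 1 the loss for a batch of size B cannot drop below log(1 + (B - 1) * e^-2).

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm (eps guards against division by zero)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def info_nce_loss(user_emb, query_emb, normalize=False, temperature=1.0):
    """InfoNCE with in-batch negatives (an assumed setup, see the text above).

    user_emb, query_emb: arrays of shape (B, D); row i of each forms a positive pair.
    normalize: if True, L2-normalize both towers' outputs before the dot product.
    temperature: hypothetical scaling factor applied to the similarity logits.
    """
    if normalize:
        user_emb = l2_normalize(user_emb)
        query_emb = l2_normalize(query_emb)

    # (B, B) similarity matrix; the diagonal holds the positive pairs.
    logits = user_emb @ query_emb.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy where the target class of row i is column i.
    return float(-np.mean(np.diag(log_probs)))


# Synthetic data for illustration only: queries are noisy copies of users.
rng = np.random.default_rng(0)
users = rng.normal(size=(8, 16))
queries = users + 0.1 * rng.normal(size=(8, 16))

loss_raw = info_nce_loss(users, queries)
loss_norm_t1 = info_nce_loss(users, queries, normalize=True, temperature=1.0)
loss_norm_t005 = info_nce_loss(users, queries, normalize=True, temperature=0.05)
print(loss_raw, loss_norm_t1, loss_norm_t005)
```

On this synthetic batch, the normalized variant with `temperature=1.0` stays above the bound stated above, while a smaller temperature lets the loss move toward zero. Whether this explains the drop in point 2 is not something the sketch can establish; it only shows how the two settings interact.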

unshaven, Jun 05 '24 03:06