[Question] When to use num_hard_negatives?
Intuitively, by using hard negatives we are trying to push random negatives with high logits away from the true positive. But since those negatives are random, isn't this forcing the model at step t+1 to be drastically different from the model at step t?
Also, neither of the seminal two-tower retrieval papers [1, 2] mentions any use of hard negatives. Any guidance or insight on when they are useful and when they are not?
- [1] Sampling-Bias-Corrected Neural Modeling for Large Corpus Item Recommendations
- [2] Mixed Negative Sampling for Learning Two-tower Neural Networks in Recommendations
Both implementations take the hardest negatives from within the batch. Only positive (query, item) pairs are fed into the model, and the other items in the batch serve as negatives; as a second stage, the negatives with the highest logits (the most difficult ones) are selected and forwarded to the softmax.
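
To make that two-stage selection concrete, here is a minimal sketch, assuming a `[batch, batch]` score matrix whose diagonal holds the positive pairs. The function name and the TensorFlow ops are my own choices for illustration; this is not the exact `tfrs.tasks.Retrieval` code, just the same idea.

```python
import tensorflow as tf

def retrieval_loss_with_hard_negatives(scores, num_hard_negatives):
    """Hypothetical helper sketching in-batch hard-negative selection.

    scores: [batch, batch] query-candidate logits; entry (i, i) is the
    true positive for query i, off-diagonal entries are in-batch negatives.
    """
    batch_size = tf.shape(scores)[0]
    # Stage 1: mask the positives so they cannot be picked as negatives.
    positives_mask = tf.eye(batch_size, dtype=tf.bool)
    negative_scores = tf.where(
        positives_mask,
        tf.fill(tf.shape(scores), scores.dtype.min),
        scores,
    )
    # Stage 2: keep only the num_hard_negatives highest-scoring
    # (i.e. hardest) negatives per query.
    hard_negatives, _ = tf.math.top_k(negative_scores, k=num_hard_negatives)
    positive_scores = tf.linalg.diag_part(scores)[:, None]
    # Softmax cross-entropy over [positive, hard negatives];
    # the label (index 0) points at the positive column.
    logits = tf.concat([positive_scores, hard_negatives], axis=1)
    labels = tf.zeros(batch_size, dtype=tf.int32)
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits
        )
    )

# Example: scores = tf.random.normal([4, 4])
# loss = retrieval_loss_with_hard_negatives(scores, num_hard_negatives=2)
```

One sanity check on this sketch: with num_hard_negatives set to batch_size - 1, every in-batch negative is kept and the loss reduces to the usual full in-batch sampled softmax.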