
[NIPS2023] RRHF & Wombat

Results 23 RRHF issues

Hi, this is nice work. I have some questions regarding the results in the **Comparison based on Vicuna test set** section shown in the README. How score A and score B are...

This is a good job. However, we always use BPRLoss rather than HingeLoss in pairwise learning to rank, since the margin of HingeLoss is hard to tune. So I wonder whether...
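
For readers unfamiliar with the two pairwise losses named above, here is a minimal sketch of their standard textbook forms. It is not the repository's implementation: the function names, the scalar scores, and the default `margin` are assumptions made for illustration only.

```python
import math


def hinge_loss(score_pos, score_neg, margin=1.0):
    """Pairwise hinge loss: zero once the preferred item leads by `margin`.

    `margin` is the hyperparameter the question describes as hard to tune;
    the default of 1.0 is an arbitrary illustrative value.
    """
    return max(0.0, margin - (score_pos - score_neg))


def bpr_loss(score_pos, score_neg):
    """BPR loss: -log(sigmoid(score_pos - score_neg)), with no margin.

    Computed as softplus(-diff) in a numerically stable way.
    """
    diff = score_pos - score_neg
    if diff > 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


if __name__ == "__main__":
    # Hypothetical scores for a preferred and a rejected response.
    for pos, neg in [(2.0, 0.5), (0.5, 2.0), (1.0, 1.0)]:
        print(pos, neg, hinge_loss(pos, neg), bpr_loss(pos, neg))
```

The hinge loss stops producing a gradient once the score gap exceeds the margin, whereas the BPR loss is smooth and keeps a small, decaying penalty for any finite gap, which is why it has no margin to tune.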

The idea of this paper is really great and much easier to understand than PPO. However, if there are six candidate responses, then at least batch size should be equal...