RRHF issues

Results on Comparison based on Vicuna test set

1

Hi, this is nice work. I have some questions about the results in the **Comparison based on Vicuna test set** section of the README. How score A and score B are...

LeeShiyang

Why use HingeLoss instead of BPRLoss?

1

This is a good job. However, we always use BPRLoss rather than HingeLoss in pairwise learning to rank, since the margin of HingeLoss is hard to tune. So I wonder whether...

KID-22
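
To make the question above concrete, here is a minimal sketch of the two pairwise objectives, written in plain Python over a list of per-response scores. It assumes the ranking term is a hinge over all pairs where one response has a lower reward than the other, with a `margin` parameter that defaults to zero; the function names, the `margin` argument, and the example numbers are hypothetical and are not taken from the RRHF code.

```python
import math


def hinge_rank_loss(scores, rewards, margin=0.0):
    """Sum of max(0, margin + s_i - s_j) over pairs with rewards[i] < rewards[j]."""
    loss = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if rewards[i] < rewards[j]:
                loss += max(0.0, margin + scores[i] - scores[j])
    return loss


def bpr_rank_loss(scores, rewards):
    """Sum of -log(sigmoid(s_j - s_i)) over the same pairs."""
    loss = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if rewards[i] < rewards[j]:
                diff = scores[j] - scores[i]
                # -log(sigmoid(diff)) == softplus(-diff), computed stably.
                loss += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return loss


# Hypothetical scores (e.g. length-normalized log-probabilities) and rewards.
rewards = [0.2, 0.9, 0.5]
misordered = [-1.0, -2.0, -0.5]   # the best-rewarded response scores lowest
well_ordered = [-3.0, -1.0, -2.0]  # scores already follow the reward order

print(hinge_rank_loss(misordered, rewards), bpr_rank_loss(misordered, rewards))
print(hinge_rank_loss(well_ordered, rewards), bpr_rank_loss(well_ordered, rewards))
```

With a zero margin the hinge term vanishes as soon as the scores follow the reward order, while the BPR term stays positive and keeps widening the gap. That is the practical difference behind the question; which behavior is preferable here is for the authors to say.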

This loss seems to consume a lot of memory.

4

The idea of this paper is really great and much easier to understand than PPO. However, if there are six candidate responses, then the batch size should at least be equal...

piekey1994

Runtime error: data type error

Hello authors, while reproducing RRHF I ran into a data type error. I configured fsdp_config for distributed training, and when I use --bf16 mixed precision I get the following error: return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse) RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while...

sqqiao

RRHF with Online Sampling

1

Thanks to the authors for this work. Could you share the code for RRHF with online sampling? I would like to run a reproduction experiment.

sqqiao

resize embedding after add_special_tokens

Hi, thanks for your great work! I would like to point out a potential bug in this code: calling add_special_tokens without checking the embedding size is very dangerous, especially for LLaMA. In...

Switchsyj
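
The failure mode described above can be illustrated with a toy, library-free sketch: a vocabulary grows by one special token, but the embedding table keeps its old number of rows until it is resized explicitly. Everything here (`resize_embeddings`, the tiny vocabulary, the `[PAD]` token, the initialization scale) is a hypothetical stand-in, not the repository's code. In Hugging Face Transformers the usual pairing is `tokenizer.add_special_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
import random


def resize_embeddings(table, new_vocab_size, dim, seed=0):
    """Return a copy of `table` with exactly `new_vocab_size` rows.

    Existing rows are kept; missing rows are filled with small random values.
    """
    rng = random.Random(seed)
    rows = [row[:] for row in table]
    while len(rows) < new_vocab_size:
        rows.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return rows[:new_vocab_size]


dim = 4
vocab = {"<s>": 0, "</s>": 1, "hello": 2}
table = [[0.0] * dim for _ in vocab]  # one embedding row per known token

# Adding a special token gives it the next free id ...
vocab["[PAD]"] = len(vocab)
pad_id = vocab["[PAD]"]

# ... but the table still has the old size, so looking up the new id fails.
try:
    table[pad_id]
    lookup_failed = False
except IndexError:
    lookup_failed = True

# Check the sizes and resize before using the new token id.
if len(table) < len(vocab):
    table = resize_embeddings(table, len(vocab), dim)

print(lookup_failed, len(table), len(vocab))
```

In this toy the stale lookup raises an `IndexError`; how a real model fails when the id is out of range depends on the device and framework, which is why checking the sizes right after adding tokens is the safer habit.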
