
How to use `predict` function in `DPOTrainer`

Open AIR-hl opened this issue 7 months ago • 2 comments

I want to get the log-probabilities (logp) and rewards for my data through `predict`, but the prediction output seems to include only a single example.

What is the correct usage of `predict`?

image

AIR-hl, Jul 12 '24 06:07