rl-prompt Repeating tokens in optimized prompt

Open AMJasser opened this issue 8 months ago • 0 comments

Hello there, I am applying your work in a different setting, one unrelated to text style transfer or classification. During evaluation, the model almost always produces a prompt made of a single repeated token, such as ['Private', 'Private', 'Private', 'Private', 'Private', 'Private'] or ['Policy', 'Policy', 'Policy', 'Policy', 'Policy', 'Policy']. How can I improve the model's performance? I'd love your expert insight on which hyperparameters are most important to tune for better results.

Jun 10 '24 21:06 AMJasser
