alignment-handbook
Question about AI Feedback (AIF)
In the AI Feedback (AIF) phase, where GPT-4 serves as the teacher model, I am curious whether GPT-4 might have a propensity to assign higher ratings to its own outputs.
Additionally, I am interested in the statistical distribution of the large language models whose completions were selected as $y_w$ during the AI Feedback (AIF) evaluation in your study. Have you analyzed how frequently each LLM was selected?
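To make the second question concrete, here is a minimal sketch of the kind of analysis I have in mind: counting how often each model's completion was chosen as $y_w$ across the preference pairs. The record layout and field name `chosen_model` are assumptions for illustration, not the actual dataset schema.

```python
from collections import Counter

# Hypothetical preference pairs: each record notes which model produced
# the chosen (y_w) completion. Field names are illustrative assumptions.
preference_pairs = [
    {"chosen_model": "gpt-4"},
    {"chosen_model": "claude-2"},
    {"chosen_model": "gpt-4"},
    {"chosen_model": "llama-2-70b-chat"},
]

# Tally how often each model's output was selected as y_w.
counts = Counter(pair["chosen_model"] for pair in preference_pairs)
total = sum(counts.values())
for model, n in counts.most_common():
    print(f"{model}: {n} ({n / total:.1%})")
```

A table of these frequencies over the full AIF dataset would directly answer whether GPT-4's own completions dominate the chosen responses.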
Thank you!