evals

closedqa prompt is not adequate for gpt-4-0613

Open JasonGross opened this issue 1 year ago • 0 comments

It seems that gpt-4 fails to follow the instructions in the closedqa prompt much more often than gpt-3.5-turbo does. See, for example, https://github.com/openai/evals/issues/1200#issuecomment-1605238900, where gpt-4 gives 9 invalid responses out of 47 while gpt-3.5-turbo gives none. Does this hold across the other evals in the repo?
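
One way to check this across evals would be to tally, per model, how many grader responses fail to end in a valid choice. Below is a minimal sketch of that tally. It assumes the grader's final non-empty line should be exactly `Y` or `N`, which may not match how the evals repo actually parses responses. `classify` and `invalid_rate` are hypothetical helpers, not part of evals, and the sample strings are made up.

```python
from collections import Counter

# Assumption: the grader is expected to finish with one of these labels.
VALID_CHOICES = ("Y", "N")
INVALID = "__invalid__"


def classify(response, choices=VALID_CHOICES):
    """Return the choice on the last non-empty line, or INVALID if there is none."""
    lines = [ln.strip() for ln in response.strip().splitlines() if ln.strip()]
    if not lines:
        return INVALID
    # Tolerate trailing punctuation or quotes around the label.
    last = lines[-1].strip(" .\"'")
    return last if last in choices else INVALID


def invalid_rate(responses, choices=VALID_CHOICES):
    """Return (invalid_count, total, invalid_fraction) for a list of responses."""
    counts = Counter(classify(r, choices) for r in responses)
    total = sum(counts.values())
    invalid = counts[INVALID]
    return invalid, total, (invalid / total if total else 0.0)


if __name__ == "__main__":
    # Made-up sample responses, for illustration only.
    sample = ["Some reasoning.\nY", "N", "I think the answer is yes."]
    print(invalid_rate(sample))
```

Running the same tally over each model's sampled grader outputs for every model-graded eval would show whether the gap is specific to closedqa.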

JasonGross, Jun 24 '23 02:06