Add LLM-as-a-Judge guide

Open ankrgyl opened this issue 1 year ago • 0 comments

Summary

This new cookbook example walks through how to create a custom LLM-as-a-judge, and in particular explores (a) different methods (numeric rating vs. classification) and (b) how to evaluate each method's effectiveness.

Motivation

Creating good LLM-as-a-judge scorers is hard, and there isn't much content out there on how to do it. Judges also need to be evaluated and measured just like any other AI application, so hopefully this example helps folks understand how to do that.


For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

  • [x] I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
  • [x] I have conducted a self-review of my content based on the contribution guidelines:
    • [x] Relevance: This content is related to building with OpenAI technologies and is useful to others.
    • [x] Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
    • [x] Spelling and Grammar: I have checked for spelling or grammatical mistakes.
    • [x] Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
    • [x] Correctness: The information I include is correct and all of my code executes successfully.
    • [x] Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.

ankrgyl · Oct 14 '24 20:10