How is the "Auto CoT" prompt defined?
G-Eval includes "Auto Chain-of-Thoughts for NLG Evaluation" as a component, where the CoT steps used to carry out the evaluation are produced by an LLM. Neither the paper nor this repo, however, includes the prompt definition. It would be convenient to have it available, both for reproducibility and to extend G-Eval to other criteria.
On second read, it looks like the evaluation steps are produced via the completions API, which is now considered legacy by OpenAI. Using the completions API, I am still unable to reproduce the coherence evaluation steps outlined in this prompt with either of the GPT-3.5 models:
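For reference, here is a minimal sketch of what step generation via the legacy completions endpoint might look like. The prompt wording and the `gpt-3.5-turbo-instruct` model name are assumptions modeled on the paper's coherence criterion, not the authors' actual prompt (which is exactly what this issue is asking for):

```python
# Hedged sketch of Auto-CoT step generation via OpenAI's legacy
# /v1/completions endpoint. The prompt template below is an assumption
# based on the paper's coherence criterion, NOT the authors' prompt.
import json
import urllib.request


def build_auto_cot_prompt(task: str, criterion: str, definition: str) -> str:
    """Compose a prompt that asks the model to emit numbered evaluation steps."""
    return (
        f"You will be given one {task}.\n\n"
        f"Your task is to rate the {task} on one metric.\n\n"
        "Evaluation Criteria:\n\n"
        f"{criterion} (1-5) - {definition}\n\n"
        "Evaluation Steps:\n"
    )


def generate_steps(prompt: str, api_key: str,
                   model: str = "gpt-3.5-turbo-instruct") -> str:
    """Call the legacy completions endpoint (requires a valid API key)."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/completions",
        data=json.dumps({
            "model": model,
            "prompt": prompt,
            "max_tokens": 256,
            "temperature": 0,
        }).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]


# Example prompt for the coherence criterion (definition paraphrased).
prompt = build_auto_cot_prompt(
    "summary",
    "Coherence",
    "the collective quality of all sentences.",
)
```

Even with a template like this, the generated steps vary run to run, which is why having the exact prompt published matters for reproducibility.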
@calvdee What do you mean by "the evaluation steps are produced via the completions API"? How did you find that?