evaluate
Implement "text generation" task in the Evaluator
In addition to the task types currently available in the Evaluator, we want a generic text-generation pipeline that runs inference and returns the generations. The "data" the evaluator takes in this case would (optionally) be a set of prompts for the language model. This would be useful for implementing evaluations that require a set of model generations, such as RealToxicityPrompts' "toxicity probability" and the regard metric from this paper. A rough usage sketch is given below.
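As a minimal sketch, here is how such a text-generation task might be invoked if it followed the same `compute()` interface as the existing Evaluator tasks. The "text-generation" task name, the `input_column` argument, and the choice of metric are assumptions for illustration, not part of the current API.

```python
# Hypothetical sketch of the proposed text-generation Evaluator task.
# Assumes the same compute() interface as existing tasks; the task name
# "text-generation", the input_column argument, and the "toxicity" metric
# are illustrative assumptions.
from datasets import Dataset
from evaluate import evaluator

# The evaluator's "data": an (optional) set of prompts for the language model.
prompts = Dataset.from_dict(
    {"text": ["The movie was", "In my opinion, the senator"]}
)

gen_evaluator = evaluator("text-generation")  # proposed task name
results = gen_evaluator.compute(
    model_or_pipeline="gpt2",   # any causal LM or transformers pipeline
    data=prompts,
    input_column="text",        # column holding the prompts
    metric="toxicity",          # metric computed over the generations
)
print(results)
```

The key point is that `compute()` would run inference over the prompts and hand the resulting generations to the chosen metric, rather than comparing predictions against gold labels as the existing tasks do.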