Benchmarking of prompt2model on composite benchmarks
Currently, we have benchmarked prompt2model extensively on three tasks (as detailed in our preprint).
But it would be much cooler if we could benchmark it on the wide range of tasks included in composite benchmarks. Examples include:
In order to do this, we'll have to:
- [ ] create code that interfaces with each of these benchmarks
- [ ] run experiments
- [ ] probably finish https://github.com/neulab/prompt2model/issues/285
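For the first checkbox, one possible shape for the benchmark-interfacing code is a small adapter protocol that every composite benchmark implements, so the experiment runner can iterate over tasks uniformly. This is only a hypothetical sketch — `TaskExample`, `BenchmarkAdapter`, and `InMemoryAdapter` are not existing prompt2model names, and a real adapter would load data from the benchmark itself rather than an in-memory dict:

```python
from dataclasses import dataclass
from typing import Iterable, Protocol

# Hypothetical sketch only: these names do not exist in prompt2model.
# Each composite benchmark would get one adapter implementing this protocol.


@dataclass
class TaskExample:
    """A single (input, output) pair drawn from a benchmark task."""

    input_text: str
    output_text: str


class BenchmarkAdapter(Protocol):
    """Uniform interface the experiment runner would consume."""

    def task_names(self) -> list[str]:
        """Return the names of all tasks in this benchmark."""
        ...

    def examples(self, task_name: str) -> Iterable[TaskExample]:
        """Yield the evaluation examples for one task."""
        ...


class InMemoryAdapter:
    """Toy adapter backed by a dict, standing in for a real benchmark loader."""

    def __init__(self, tasks: dict[str, list[tuple[str, str]]]):
        self._tasks = tasks

    def task_names(self) -> list[str]:
        return sorted(self._tasks)

    def examples(self, task_name: str) -> Iterable[TaskExample]:
        for inp, out in self._tasks[task_name]:
            yield TaskExample(inp, out)


adapter = InMemoryAdapter({"toy_task": [("some input", "some output")]})
print(adapter.task_names())  # → ['toy_task']
```

With this shape, "run experiments" reduces to a loop over `adapter.task_names()` that hands each task's examples to the existing evaluation pipeline, and adding a new composite benchmark means writing one more adapter.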