ExplainaBoard
Evaluation for Gaokao Benchmark
- [x] multiple-choice -> accuracy
- [ ] conditional generation-based QA (hint) -> accuracy
- [ ] grammatical error correction -> recall?
- [ ] conditional text generation -> human evaluation
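
The two automatic metrics named in the list can be sketched as follows. This is only an illustration, not ExplainaBoard's implementation: the function names `accuracy` and `edit_recall` are hypothetical, accuracy is assumed to mean exact match against a single reference answer, and recall for grammatical error correction is assumed to be computed over sets of edits (the list itself marks this metric as undecided).

```python
from typing import Hashable, Sequence, Set


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that exactly match their reference (assumed definition)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def edit_recall(predicted_edits: Set[Hashable], gold_edits: Set[Hashable]) -> float:
    """Fraction of gold edits that the system also proposed (assumed definition)."""
    if not gold_edits:
        # Nothing to recover, so treat recall as perfect by convention.
        return 1.0
    return len(predicted_edits & gold_edits) / len(gold_edits)


# Multiple-choice example: three of four answers match.
mc_score = accuracy(["A", "C", "B", "D"], ["A", "B", "B", "D"])

# Error-correction example: edits are hypothetical (start, end, replacement) tuples.
gec_score = edit_recall({(0, 1, "is"), (3, 4, "the")}, {(0, 1, "is"), (5, 6, "a")})
```

Exact match suits multiple-choice answers, where the label space is closed. For hint-based QA it may be too strict unless answers are normalized first, and recall alone ignores spurious edits, which may be why that metric is still marked with a question mark.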