evaluation
Add MNLI to Full Benchmark
Coordinate with whoever is working on SuperGLUE; we only need to include MNLI once. However, NLI will be held out from model training (whereas the other SuperGLUE tasks will not), so MNLI results need to be interpreted differently from those of the other SuperGLUE tasks.
Use it to test generalization to an unseen task; maybe use FLEX?
I can do it :)
@PierreColombo I'd love to help contribute to this one if you need any help!
MNLI in PS here