Add more non-English datasets: IndoNLU

Open gentaiscool opened this issue 2 years ago • 0 comments

I would like to propose adding new Indonesian datasets from the IndoNLU benchmark (https://github.com/IndoNLP/indonlu) for multilingual evaluation. The benchmark covers several tasks, including sentiment analysis, emotion classification, and textual entailment. We could start with a single task, such as sentiment analysis.

I was wondering whether we can still add new non-English datasets, since our coverage of non-English tasks is currently low.

If it is okay, I would be very happy to assign myself to add those :)

gentaiscool • Jun 13 '22 00:06