IndoNLP
IndoNLP
indonlu
The first-ever vast natural language processing benchmark for Indonesian Language. We provide multiple downstream tasks, pre-trained IndoBERT models, and a starter code! (AACL-IJCNLP 2020)
nusa-crowd
A collaborative project to collect datasets in Indonesian languages.
indonlg
The first-ever vast natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained IndoGPT and IndoBART models, and a starter code! (E...
nusax
High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)
nusa-writes
NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.