nlp-datasets topic
HistSumm
Code and data for "Summarising Historical Text in Modern Languages" (EACL 2021)
nlp-library
Curated collection of papers for the NLP practitioner 📖👩‍🔬
multi-task-NLP
multi_task_NLP is a utility toolkit that lets NLP developers easily train a single model for multiple tasks and run inference with it.
kartaslov
Open linguistic datasets: the Russian sentiment lexicon КартаСловСент (KartaSlovSent), a semantics dataset, an association graph, and a dataset of spelling errors and typos.
ua-gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
TriggerNER
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition (ACL 2020)
Awesome-Indonesia-NLP
NLP resources for Bahasa Indonesia
CommonGen
A Constrained Text Generation Challenge Towards Generative Commonsense Reasoning
VDCNN
Implementation of Very Deep Convolutional Neural Network for Text Classification
nlp-public-dataset
Chinese and English NER datasets, an English-Chinese machine translation dataset, and a Chinese word segmentation dataset.