corpus topic

List corpus repositories

awesome-persian-nlp-ir

704
Stars
113
Forks

Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources

CLUE

3.9k
Stars
540
Forks

Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard

NLP_bahasa_resources

446
Stars
120
Forks

A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia

nlp_chinese_corpus

9.2k
Stars
1.5k
Forks

Large Scale Chinese Corpus for NLP

Company-Names-Corpus

1.2k
Stars
374
Forks

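Company name corpus. Organization name corpus. Company short names, abbreviations, brand terms, and enterprise names. Can be used for Chinese word segmentation and organization named entity recognition.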

indonesian-NLP-resources

220
Stars
50
Forks

Data resources for Indonesian NLP

Wordless

673
Stars
88
Forks

An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation

Chinese-Names-Corpus

3.9k
Stars
976
Forks

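Chinese personal name corpus. Name generator. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Can be used for Chinese word segmentation and person named entity recognition.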

CLUEDatasetSearch

3.9k
Stars
597
Forks

Search all Chinese NLP datasets, with commonly used English NLP datasets included

Chinese-NLP-Corpus

854
Stars
207
Forks

A collection of Chinese NLP corpora