corpus topic
eKeyboard
Make typing Amharic [on mobile] great [again].
sejong-corpus
Download and simple analysis of the Korean Sejong corpus
german-nouns
A list of ~100,000 German nouns and their grammatical properties, compiled from WiktionaryDE as a CSV file, plus a module to look up the data and parse compound words.
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
chatterbot-corpus
A multilingual dialog corpus
FakeNewsCorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Lenta.Ru-News-Dataset
Corpus of Russian news articles collected from Lenta.Ru
EdgarAllanPoetry
Computer-generated poetry
aspen
🔎 📖 ✨ Custom, private search engine for text documents built with NextJS/React/ES6/ES7