text-preprocessing topic
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
clean-text
🧹 Python package for text cleaning
texthero
Text preprocessing, representation and visualization from zero to hero.
prenlp
Preprocessing Library for Natural Language Processing
100DaysOfMLCode
Learning Machine Learning and showcasing my work for 100 Days.
normalizer
This Python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine trans...
python-mecab
Python 3.5+ bindings for MeCab, built without SWIG or pybind. (No longer maintained.)
panda
Panda is a Pandoc Lua filter that works on Pandoc's internal AST. It is heavily inspired by [abp](http://cdelord.fr/abp), reimplemented as a Pandoc Lua filter.