text-preprocessing topic

List text-preprocessing repositories

trafilatura

3.0k stars, 228 forks

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

clean-text

929 stars, 77 forks

🧹 Python package for text cleaning

texthero

2.9k stars, 238 forks

Text preprocessing, representation and visualization from zero to hero.

prenlp

159 stars, 12 forks

Preprocessing Library for Natural Language Processing

texttk

19 stars, 2 forks

Text Preprocessing in Python

100DaysOfMLCode

16 stars, 7 forks

Learning Machine Learning and showcasing my work for 100 Days.

normalizer

31 stars, 7 forks

This Python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine trans...

python-mecab

28 stars, 7 forks

Python 3.5+ bindings for MeCab, built without SWIG or pybind. (No longer maintained.)

panda

39 stars, 5 forks

Panda is a Pandoc Lua filter that works on Pandoc's internal AST. It is heavily inspired by [abp](http://cdelord.fr/abp), reimplemented as a Pandoc Lua filter.

jange

17 stars, 4 forks

Easy NLP in Python