text-extraction topic

List of text-extraction repositories

wikipedia_ner

68
Stars
7
Forks

:book: Labeled examples from wiki dumps in Python

any-text

58
Stars
9
Forks

Get text content from any file

tokyo

18
Stars
0
Forks

tokyo is a REST API that, given any type of document 📄, identifies its MIME type 🧐, suggests an extension 🦔, and also extracts text 💪.

textextractor2.0

18
Stars
11
Forks

:fire: This web app extracts text from an image.

pdf-text-extraction-benchmark

61
Stars
11
Forks

A project that benchmarks and evaluates existing PDF extraction tools on their semantic ability to extract body text from PDF documents, especially scientific articles.

mobi

55
Stars
8
Forks

Python-based software to unpack KindleGen-generated ebooks

mirusan

19
Stars
1
Forks

A PDF collection reader with a built-in full-text search engine

pnlp

28
Stars
7
Forks

NLP pre-/post-processing tools.