pdftotext topic

List pdftotext repositories

cat

90 Stars, 18 Forks

Extract text from plaintext, .docx, .odt and .rtf files. Pure Go.

Iron-OCR-Image-to-Text-in-CSharp

72 Stars, 17 Forks

Image to Text Tutorial in C# - See https://ironsoftware.com/csharp/ocr/tutorials/how-to-read-text-from-an-image-in-csharp-net/

pdf-to-text

76 Stars, 33 Forks

Read PDF files in JavaScript

pyxpdf

38 Stars, 16 Forks

Fast and memory-efficient Python PDF Parser based on xpdf sources

pdf2dataset

17 Stars, 3 Forks

Converts a whole subdirectory with a big (or small) volume of PDF documents to a dataset (pandas DataFrame) with error tracking and choice of features

Extract-Data-From-PDF-In-Python

25 Stars, 11 Forks

Batch-convert PDF to text and extract data from PDF files in Python

aiopytesseract

17 Stars, 6 Forks

A Python asyncio wrapper for Tesseract-OCR.