pdf-to-text topic

List pdf-to-text repositories

pd3f

277
Stars
35
Forks

🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based

SciTSR

331
Stars
56
Forks

Table structure recognition dataset from the paper "Complicated Table Structure Recognition"

PDF-TOOLBOX

80
Stars
9
Forks

A Multi Purpose PDF Toolkit

converter

40
Stars
12
Forks

Standalone .NET converter library that requires neither Adobe Acrobat components nor Microsoft Office Interop assemblies to convert PDF, DOCX, XLSX, HTML, image, CSV, RTF, and TXT files in .NET Framework

pdf-text-extraction

62
Stars
16
Forks

CLI for extracting text from PDF files (and possibly tables)

Docotic.Pdf.Samples

69
Stars
39
Forks

C# and VB.NET samples for Docotic.Pdf library

php-pdf-2-text

24
Stars
17
Forks

Simple PHP PDF to Text class

unstructured

8.6k
Stars
702
Forks
43
Watchers

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Extract-Data-From-PDF-In-Python

25
Stars
12
Forks

Batch-convert PDF to text and extract data from PDF files in Python

nocodefunctions-web-app

34
Stars
5
Forks

The code base of the front-end of nocodefunctions.com