pdf-to-json topic

List pdf-to-json repositories

unstructured

8.6k
Stars
702
Forks

Open source libraries and APIs for building custom preprocessing pipelines for labeling, training, or production machine learning.

statement-parser

30
Stars
5
Forks

Parse bank and credit card statements

ocr-python

74
Stars
11
Forks

OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.

PDF-Verse

186
Stars
52
Forks

PDF Verse is a powerful web-based PDF editor with tools for editing, converting, and manipulating PDFs. Merge, compress, add or remove pages, or extract text using OCR technology. Convert PDF to DOC,...

docling

22.8k
Stars
1.3k
Forks

Get your documents ready for gen AI

docstrange

1.0k
Stars
98
Forks
1.0k
Watchers

Extract and convert data from any document, image, PDF, Word doc, PPT, or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.