pdf-extractor topic
pdfsam
PDFsam, a desktop application to split, merge, mix, rotate PDF files and extract pages
doc_crawler.py
Explore a website recursively and download all the wanted documents (PDF, ODT…)
PdfPig
Read and extract text and other content from PDFs in C# (port of PDFBox)
docnet
DocNET is a fast PDF editing and reading library for modern .NET applications
python-pdftables-api
Python library to interact with https://pdftables.com API
pdf-to-txt-python
Simple PDF-to-text conversion in Python using PDFtk and PyPDF2
madgrades-extractor
UW-Madison course and grade distribution data extraction tool.
documind
Open-source platform for extracting structured data from documents using AI.