extract-text topic
cat
Extract text from plaintext, .docx, .odt and .rtf files. Pure Go.
textract
Node.js module for extracting text from HTML, PDF, DOC, DOCX, XLS, XLSX, CSV, PPTX, PNG, JPG, GIF, RTF and more!
pd3f
🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
fulltext
:warning: ARCHIVED :warning: Search across and get full text for OA & closed journals
tikaondotnet
Use the Java Tika text extraction library on the .NET platform
PDFs-TextExtract
Text extraction from multiple and large PDF documents.
pdf-to-text
Read PDF files in JavaScript
tokyo
tokyo is a REST API that, given any type of document 📄, identifies its MIME type 🧐, suggests an extension 🦔, and also extracts text 💪.