pd3f

3 repositories owned by pd3f

pd3f

277 stars, 35 forks

🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based

dehyphen

37 stars, 4 forks

📜 Dehyphenation of broken text (mainly German), e.g., text extracted from a PDF

pd3f-core

34 stars, 8 forks

📑 Python package to reconstruct the original continuous text from PDFs with language models