pdftohtml topic

List pdftohtml repositories

pdf2html

138 Stars, 30 Forks

pdf2html is a module that converts PDF files to HTML pages using Apache Tika. It can also generate thumbnail images for PDF files using Apache PDFBox.

pyxpdf

39 Stars, 16 Forks

A fast, memory-efficient Python PDF parser based on the xpdf sources.