
Local HTML -> PDF rendering

Open captn3m0 opened this issue 4 years ago • 5 comments

# Title

- [chapter 1](chapter1.html)
- [chapter 2](chapter2.html)

Convert the HTML to PDFs and merge accordingly.
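A rough sketch of the first step, assuming the input is a Markdown file shaped like the example above: pull out the links that point at local HTML files and work out which PDF each one would become. The names `local_html_links`, `planned_pdf_paths`, and the `build` output directory are hypothetical and not part of pystitcher; the actual rendering and merging are left out.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [chapter 1](chapter1.html).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def local_html_links(markdown_text):
    """Return (title, href) pairs for links to local .html files, in document order."""
    links = []
    for title, href in LINK_RE.findall(markdown_text):
        # Skip remote URLs; only local files are in scope for this sketch.
        if "://" in href:
            continue
        if href.lower().endswith((".html", ".htm")):
            links.append((title, href))
    return links


def planned_pdf_paths(links, out_dir="build"):
    """Map each local HTML link to the PDF path it would be rendered to (assumed layout)."""
    return [Path(out_dir) / (Path(href).stem + ".pdf") for _, href in links]


SAMPLE = """# Title

- [chapter 1](chapter1.html)
- [chapter 2](chapter2.html)
"""

if __name__ == "__main__":
    links = local_html_links(SAMPLE)
    for (title, href), pdf in zip(links, planned_pdf_paths(links)):
        print(f"{title}: {href} -> {pdf}")
```

Each planned PDF would then be produced by whichever renderer is chosen and merged in link order.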

captn3m0, May 26 '21 15:05

Notes from PDF rendering research in Python:

  • The best results come from running pandoc, but that brings in LaTeX dependencies, which we want to avoid.
  • The next best tools are:
    • wkhtmltopdf, which also has installation concerns. I don't like the render quality either.
    • xhtml2pdf, which uses PyPDF2 (😞) and reportlab underneath.
    • reportlab used directly. It has a few native dependencies (libfreetype, libpng, libz), but they come bundled with it.
    • rinohtype, which is pure Python but still under heavy development and missing some important features for now (such as remote images).
    • weasyprint, which I haven't tried much yet.

Did some experiments with all of the above to get close-to-pandoc typography. Will add some more details here.

captn3m0, Jun 27 '21 14:06

I tried out WeasyPrint and the results look promising to me, but it does have a couple of external dependencies, including Pango. However, there do seem to be plans to minimize the external dependencies.
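For reference, a minimal sketch of rendering one local HTML file with WeasyPrint, assuming the package and its native dependencies (Pango among them) are installed. The helper name `render_with_weasyprint` is hypothetical; `HTML(filename=...)` and `write_pdf(...)` are the WeasyPrint calls it relies on.

```python
from pathlib import Path


def render_with_weasyprint(html_path, pdf_path):
    """Render a local HTML file to a PDF file and return the PDF path."""
    # Imported lazily so this module still loads when WeasyPrint is not installed.
    from weasyprint import HTML

    pdf_path = Path(pdf_path)
    # Make sure the output directory exists before writing.
    pdf_path.parent.mkdir(parents=True, exist_ok=True)
    HTML(filename=str(html_path)).write_pdf(str(pdf_path))
    return pdf_path
```

Something like `render_with_weasyprint("chapter1.html", "build/chapter1.pdf")` would be called once per chapter before merging.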

Vonter, Jan 02 '22 05:01

My recommended options are (not in any order):

  • reportlab (dependencies are bundled in the wheel, so not much of a concern)
  • xhtml2pdf (but with a fix for https://github.com/xhtml2pdf/xhtml2pdf/issues/560)
  • rinohtype, which is pre-stable currently, but holds good promise. Even without remote images, I think it will make for a good start.

captn3m0, Jan 02 '22 15:01

I filed a PR to xhtml2pdf for PyPDF3 support: https://github.com/xhtml2pdf/xhtml2pdf/pull/582.

captn3m0, Jan 18 '22 11:01