
PDF export of all pages (into one single PDF)

Open tsteur opened this issue 9 years ago • 10 comments

It would be very neat to be able to export all the pages from developer.piwik.org in one PDF. It would start with the guides followed by integration pages, the API-Reference and last but not least the changelog and maybe the links.

Maybe we could offer a download for different Piwik versions which would make it also easier to find docs for an older Piwik version as the current docs are always based on master. Ideally the links to the other guides etc would work in the PDF.

It shouldn't be too complicated to implement but would actually provide high value as it would allow people to read about various features eg on the bus etc.

tsteur avatar Jan 08 '16 01:01 tsteur

epub + mobi would be nice to have

Maybe something like http://pandoc.org/index.html could be used to generate the books

tsteur avatar Jan 08 '16 01:01 tsteur

Summary of my brainstorming about offline docs:

Markdown -> epub -> PDF (with pandoc)

Advantages:

  • ebooks will be readable on all screen sizes / ebook readers
  • relatively easy metadata (table of contents)
  • technically the easiest as epub is quite similar to HTML and CSS on the web

Disadvantages:

  • epub to pdf conversion not as simple as I thought (at least in a quick test in calibre cli)
  • the docs rely on a lot of custom markdown extensions, which probably would be broken
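
A minimal sketch of the Markdown -> epub step, assuming pandoc is installed and on the PATH. The file names, output name and title are placeholders, and the custom markdown extensions mentioned above are not handled here:

```python
import subprocess


def build_epub_command(sources, output="developer-docs.epub",
                       title="Developer Documentation"):
    """Return the pandoc argument list that merges Markdown files into one epub.

    The source paths, output name and title are illustrative placeholders.
    """
    return [
        "pandoc",
        *sources,                         # input files, in reading order
        "--from", "markdown",
        "--toc",                          # generate a table of contents
        "--metadata", f"title={title}",   # epub metadata
        "--output", output,
    ]


def build_epub(sources, **kwargs):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_epub_command(sources, **kwargs), check=True)
```

Calling `build_epub(["guides/intro.md", "guides/plugins.md"])` would write `developer-docs.epub` into the current directory, provided those (hypothetical) files exist.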

Markdown -> LaTeX -> PDF (with pandoc and pdflatex)

Advantages:

  • most beautiful result if done right
  • things like table of contents and pdf metadata are simple

Disadvantages:

  • as above, a lot of custom markdown
  • markdown and LaTeX are quite different, so fully automated conversion will get difficult
  • need to write an extensive template
  • no way to create epub (apart from hacky pdf -> epub conversion)

a creative hack converting the existing websites to pdf (with Headless Browser?)

Advantages:

  • Takes advantage of existing code (-> no Markdown conversion issues)
  • (I think) least amount of work

Disadvantages:

  • It may get ugly (depending on how well the conversion works)
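
A rough sketch of the headless browser idea, assuming a Chromium-based browser whose command line supports `--headless` and `--print-to-pdf`. The binary name `chromium` and the example URL are assumptions, and merging the per-page PDFs into one file is left out:

```python
import subprocess
from urllib.parse import urlparse


def pdf_name_for(url):
    """Derive a flat PDF file name from the URL path (naming scheme is made up)."""
    path = urlparse(url).path.strip("/")
    return (path.replace("/", "-") or "index") + ".pdf"


def print_to_pdf_command(url, browser="chromium"):
    """Return the browser command that renders one page to a PDF file."""
    return [
        browser,                               # assumed binary name, adjust per system
        "--headless",
        f"--print-to-pdf={pdf_name_for(url)}",
        url,
    ]


def export_pages(urls, browser="chromium"):
    """Render every URL to its own PDF, one browser run per page."""
    for url in urls:
        subprocess.run(print_to_pdf_command(url, browser), check=True)
```

How ugly the result gets depends mostly on the print stylesheet the website serves, not on this script.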

making existing website work offline (with Service Workers)

Advantages:

  • probably best user experience as after toggling a switch everything works the same, but offline
  • low amount of work (depending on bonus features)

Disadvantages:

  • Doesn't really solve the issue mentioned in the first comment (versioned, archivable documentation)
  • only works in modern browsers (in my opinion no issue)

General:

  • all solutions (apart from the offline website) don't really support links between "sites"

Findus23 avatar Dec 13 '17 16:12 Findus23

a creative hack converting the existing websites to pdf (with Headless Browser?)

May get ugly to set up and especially maintain etc :)

all solution (apart from offline website) don't really support links between "sites"

not really sure what you mean here between sites?

Regarding the markdown: if I remember correctly, we also have some special content in the markdown, see eg https://github.com/piwik/developer-documentation#supported-inline-tags-in-php-comments and https://github.com/piwik/developer-documentation#writing-guides . I think this would not work with any standard markdown parser (unless we maybe wrote some plugin for markdown parsers etc). Actually: we could convert that markdown to a tweaked markdown which converts eg {@link ere()} to [ere()](...).
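
A small sketch of such a pre-processing step. It only handles the plain `{@link target}` form, and `resolve_target` is a hypothetical stand-in: the real link targets would have to come from the existing documentation code.

```python
import re

# Matches the plain form {@link target}; other inline tags are left untouched.
INLINE_LINK = re.compile(r"\{@link\s+([^}]+?)\s*\}")


def resolve_target(target):
    """Hypothetical resolver: turn a target into an in-document anchor."""
    return "#" + re.sub(r"[^a-z0-9]+", "-", target.lower()).strip("-")


def convert_inline_links(markdown):
    """Rewrite {@link target} tags as ordinary Markdown links."""
    def replace(match):
        target = match.group(1)
        return f"[{target}]({resolve_target(target)})"

    return INLINE_LINK.sub(replace, markdown)


print(convert_inline_links("See {@link ere()} for details."))
```

The output could then be fed to any standard markdown converter, at the cost of duplicating logic that the website renderer already has.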

I would probably try to find two PHP libs like Markdown to epub and markdown to pdf. But I think for now it would be enough to have just Markdown to PDF and later we could maybe add other formats if / when needed? May be good to keep things simple if there is no easy solution out there yet.

tsteur avatar Dec 13 '17 19:12 tsteur

May get ugly to setup and especially maintain etc :)

That's why it's a creative hack ;) But I'm not sure if the other solutions are really less work.

not really sure what you mean here between sites?

I interpreted "Ideally the links to the other guides etc would work in the PDF." to mean that hyperlinks inside the PDF should link to the corresponding page inside the same document. I can't think of an easy way to achieve this.

We could write a parser that converts the custom functions to normal markdown, but that would just duplicate the existing code. And if we use the existing Markdown -> HTML conversion, we are at a point where the best HTML renderer is a browser.

I don't think there are many libraries for converting Markdown to PDF (even via intermediates and if we don't limit ourselves to PHP). Pandoc is the only software I know of that supports a lot of file formats.

Those are the results of pandoc for a single markdown file to pdf via LaTeX

directly to epub

this epub to pdf via some online converter (It seems to be using calibre as it has the same ugly scrollbars)

Findus23 avatar Dec 13 '17 20:12 Findus23

The code blocks would need to be styled for "print" I reckon.

What about just having one page that renders all those pages as HTML in the browser... then people could print it as PDF themselves with the built-in browser feature. I reckon this might "do" for now as users haven't really asked for it AFAIK and it is more of a nice to have feature. Next step we could make that one-page available offline. We could also use a nice theme that looks more like a "book" for that one page so it looks nice when printing it and maybe at some point we will be able to convert it to epub?
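
A minimal sketch of that one-page idea, assuming the existing Markdown -> HTML conversion already yields an HTML fragment per page. The function name, CSS rules and element IDs are made up for illustration:

```python
from html import escape

# Print-only rules: start each page on a new sheet and wrap long code lines.
PRINT_CSS = """
@media print {
  section.page { page-break-before: always; }
  section.page:first-of-type { page-break-before: avoid; }
  pre { white-space: pre-wrap; border: 1px solid #ccc; }
}
"""


def build_single_page(pages, title="Developer Documentation"):
    """Join (page_title, html_fragment) pairs into one printable HTML document.

    The fragments are assumed to be trusted output of the existing renderer.
    """
    sections = []
    for index, (page_title, fragment) in enumerate(pages, start=1):
        sections.append(
            f'<section class="page" id="page-{index}">\n'
            f"<h1>{escape(page_title)}</h1>\n{fragment}\n</section>"
        )
    body = "\n".join(sections)
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f"<meta charset=\"utf-8\">\n<title>{escape(title)}</title>\n"
        f"<style>{PRINT_CSS}</style>\n</head>\n<body>\n{body}\n</body>\n</html>\n"
    )
```

Because every section gets a stable `id`, links between guides could in principle be rewritten to point at `#page-N` anchors inside the same document.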

tsteur avatar Dec 13 '17 20:12 tsteur

(Small note: whatever solution we use here, would be great if we could somehow use it later also for our user guides on Piwik.org so we could generate a "book" for all of Piwik, refs https://github.com/piwik/piwik/issues/5315 )

mattab avatar Dec 13 '17 20:12 mattab

That seems like a good compromise. But I am not sure if we should include all of the API documentation, as docs/3.x/**/*.md results in a 1300-page PDF (via the same online converter as above).

Findus23 avatar Dec 13 '17 20:12 Findus23

yeah I would possibly exclude the classes reference and maybe, if easily doable, package this separately in another "one-page". Alternatively, we could even package "Integrate + Develop + Piwik In Depth" into one document, and all "API-References" in another document.
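
One way to sketch that split, assuming the API reference files can be recognised by a directory name in their path. The name `api-reference` and the example paths are hypothetical and would need to match the real layout under docs/3.x:

```python
from pathlib import PurePosixPath

# Hypothetical directory names that mark API reference pages.
API_DIR_NAMES = {"api-reference"}


def split_bundles(paths):
    """Sort Markdown paths into a guides bundle and an API reference bundle."""
    bundles = {"guides": [], "api-reference": []}
    for path in sorted(paths):
        parts = set(PurePosixPath(path).parts)
        key = "api-reference" if parts & API_DIR_NAMES else "guides"
        bundles[key].append(path)
    return bundles
```

Each bundle could then be turned into its own document, which keeps the guides document far shorter than the combined one.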

tsteur avatar Dec 14 '17 03:12 tsteur

any progress on this topic?

mxdpeep avatar Jun 05 '18 21:06 mxdpeep

I was thinking about this over the weekend and thought it could be cool if there was an e-book version of the docs that people could read through offline as well, and then I came to find there is an issue for this already.

michalkleiner avatar May 08 '23 07:05 michalkleiner