
Automatically generate PDF, ePub, and MOBI versions of Scala Book

Open alvinj opened this issue 5 years ago • 5 comments

This is the start of some notes on how to automatically create PDF, ePub, and MOBI versions of Scala Book. I have already generated first versions of these documents with a manual process (excluding a couple of problems noted below), and this is a writeup of how to automate the process.

To create an ePub document

  • Copy all markdown files and LIST_OF_FILES_IN_ORDER from the website directory (_overviews/scala-book) to a working directory
  • Add a # title tag to each *.md file
    • get the title from the header section
    • prepend a chapter number to each title
  • Remove the header content from all *.md files
  • Transform all <pre> sections to use only four backticks
    • all “fenced code blocks” need to be transformed (scala, java, etc.) to use four backticks
    • actually, I know this is necessary for the PDF, but it might not be necessary for the ePub and MOBI versions
  • Generate a Pandoc command that includes all *.md files in the proper order; this command looks like this:
pandoc -o ScalaBook.epub \
    metadata.txt \
    working/introduction.md \
    working/prelude-taste-of-scala.md \
    working/preliminaries.md \
    50 more lines here ...

I have code to do all of that; I just need some free time to clean it up and automate it.
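The header steps in the list above can be sketched roughly as follows. This is not the actual script mentioned above, only a minimal Python illustration. It assumes each file starts with a Jekyll-style front-matter block delimited by `---` lines and containing a `title:` entry. The `retitle` name and the `1. Title` numbering format are hypothetical.

```python
import re

# Leading front-matter block: '---', some lines, then a closing '---'.
FRONT_MATTER = re.compile(r"\A---[ \t]*\n(.*?)\n---[ \t]*\n?", re.DOTALL)
# The 'title:' entry inside that block.
TITLE_LINE = re.compile(r"^title:[ \t]*(.*?)[ \t]*$", re.MULTILINE)


def retitle(markdown, chapter=None):
    """Replace the front matter with a '# title' heading.

    If `chapter` is given, it is prepended to the title (ePub case);
    leave it as None for the PDF case, where no number is wanted.
    """
    match = FRONT_MATTER.match(markdown)
    if match is None:
        return markdown  # no front matter: leave the text untouched
    title_match = TITLE_LINE.search(match.group(1))
    # Fall back to a placeholder if the header has no title entry.
    title = title_match.group(1).strip("'\"") if title_match else "Untitled"
    if chapter is not None:
        title = f"{chapter}. {title}"
    body = markdown[match.end():].lstrip("\n")
    return f"# {title}\n\n{body}"
```

Calling `retitle(text, chapter=n)` for the n-th entry of LIST_OF_FILES_IN_ORDER would cover both the "add a title" and "remove the header" steps in one pass.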

To create a MOBI document

There are more elaborate ways to do this, but at the moment this command seems to work, generating the MOBI document from the ePub:

kindlegen ScalaBook.epub

It looks like KindleGen is available for Linux, macOS, and Windows.

To create a PDF document

I currently have a way to do this, but it will take a little time to automate. The first few steps in the process are similar to the ePub process, but you don’t need to add a chapter number:

  • Copy all markdown files and LIST_OF_FILES_IN_ORDER from the website directory (_overviews/scala-book) to a working directory
  • Add a # title tag to each *.md file
    • get the title from the header section
  • Remove the header content from all *.md files
  • Transform all <pre> sections to use only four backticks

After that the steps are:

  • Convert all of the Markdown files to LaTeX files
  • Generate the PDF using a latexmk command
    • Historically I have manually worked through any issues that come up at this time
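As a rough sketch of these last two steps, the commands could be assembled like this. Several things here are assumptions, not part of the process described above: that Pandoc does the Markdown-to-LaTeX conversion, that a top-level file (hypothetically named `ScalaBook.tex`) pulls the chapter files together, and the `latex_commands` helper name itself.

```python
from pathlib import PurePosixPath


def latex_commands(markdown_files, main_tex="ScalaBook.tex"):
    """Return argument lists for the conversion and PDF-build steps.

    `main_tex` is a hypothetical top-level LaTeX file that includes
    the generated chapter files.
    """
    commands = []
    for md in markdown_files:
        # working/introduction.md -> working/introduction.tex
        tex = str(PurePosixPath(md).with_suffix(".tex"))
        commands.append(
            ["pandoc", "-f", "markdown", "-t", "latex", "-o", tex, md]
        )
    # latexmk reruns LaTeX as often as needed to resolve references.
    commands.append(["latexmk", "-pdf", main_tex])
    return commands
```

Each list can be passed to `subprocess.run(cmd, check=True)`, so the script stops at the first file that fails to convert and the manual troubleshooting can start from there.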

Known problems

Generating the PDF

  • LaTeX doesn’t like the trick I used with the “Prelude” title, so that has to be replaced
  • The PDF-generating process gets stuck on a line somewhere near this:
'', k, v) val keys = m.keys val values = m.values val contains3 =

I haven’t had the time to look into that yet.

Generating the ePub document

  • This process fails on the tables in the following files, so this needs to be looked into:
    • built-in-types.md
    • collections-101.md
  • The {::comment} syntax shows up in the MOBI document, so it’s probably also in the ePub
    • I’ll submit a pull request to delete all comments from Scala Book
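Until the comments are deleted from the source files, they could also be filtered out in the working copies. A minimal sketch, assuming the comments use kramdown's `{::comment}` ... `{:/comment}` extension syntax (or its short closing form `{:/}`); the `strip_comments` name is hypothetical.

```python
import re

# kramdown comment extension: {::comment} ... {:/comment} or {:/}
KRAMDOWN_COMMENT = re.compile(
    r"\{::comment\}.*?\{:/(?:comment)?\}[ \t]*\n?", re.DOTALL
)


def strip_comments(markdown):
    """Remove kramdown comment blocks, including the line break after them."""
    return KRAMDOWN_COMMENT.sub("", markdown)
```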

Tools

So far my “tools” for generating these documents are:

  • Unix shell scripts, including sed commands
  • Some custom Scala scripts
    • I wrote these to remove the Markdown header content, and add # title tags to the resulting Markdown files
  • Pandoc
    • This is used to generate the ePub and MOBI versions
  • LaTeX
    • This is used to generate the PDF
    • I use a Mac, and I think I installed the tools (several years ago) with MacTeX

What I need

All I really need to complete this process is some free time, plus confirmation that the tools listed above will be available on the server. Assuming I can work through the problems listed, the whole process is really:

  1. Copy the website Markdown files to a working directory
  2. Transform their header sections
  3. Convert the *.md files to *.tex files, and generate the PDF with the latexmk command
  4. Generate the ePub with the pandoc command
  5. Generate the MOBI with the kindlegen command

I think the ePub and MOBI files can also use a stylesheet, so that’s something else to be looked into, but I’m more concerned about automating these processes at the moment.
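Whether the tools are available on the server could be checked up front with something like the following sketch. The tool names come from the steps above; the `missing_tools` helper and its `which` parameter (there only so the lookup can be swapped out in tests) are hypothetical.

```python
import shutil

# Command-line tools used by the steps above.
REQUIRED_TOOLS = ("pandoc", "kindlegen", "latexmk")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

An empty list from `missing_tools()` means all three commands were found on the PATH of the machine running the check.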

alvinj, Oct 28 '19 23:10

I’m getting back into this process this weekend, and I just want to note that the main part of the ePub process is running this command after making some slight tweaks to the Markdown files:

pandoc -o ScalaBook.epub \
  metadata.txt \
  introduction.md \
  prelude-taste-of-scala.md \
  preliminaries.md \
  ... many more markdown filenames here ...

For that command, the metadata.txt file looks like this:

---
title: Scala Book
author: Alvin Alexander, et al
rights:  Creative Commons Non-Commercial Share Alike 3.0
language: en-US
---

You can add more information to that file, as well as a stylesheet, as described in this medium.com article.
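Since the command is just the metadata file followed by the chapters in order, it can be generated from the ordered file list. The sketch below makes a few assumptions: that LIST_OF_FILES_IN_ORDER holds one filename per line, that a stylesheet (hypothetically `book.css`) would be passed with Pandoc's `--css` option, and the helper names themselves.

```python
def read_file_list(text):
    """Parse the list file's contents: one filename per line, blanks ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def epub_command(ordered_files, output="ScalaBook.epub",
                 metadata="metadata.txt", css=None):
    """Build the pandoc argument list for the ePub."""
    command = ["pandoc", "-o", output]
    if css is not None:
        command.append(f"--css={css}")  # optional stylesheet
    command.append(metadata)            # metadata block comes first
    command.extend(ordered_files)       # then the chapters, in order
    return command
```

For example, `epub_command(read_file_list(open("LIST_OF_FILES_IN_ORDER").read()))` yields a list that can be handed to `subprocess.run`.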

That pandoc command currently has a problem with Markdown tables, which is what I’ll work on next. But once you have a working ePub file, you can then create a MOBI file with this command:

kindlegen ScalaBook.epub

The PDF-generating process is much more involved, but this process is pretty simple.

alvinj, Nov 18 '19 01:11

I put the initial tools for creating EPUB and MOBI versions here:

I don’t know how to get the website’s markdown files into the proper directory, but that should be a simple configuration change.

I’ll work on the PDF process next.

alvinj, Dec 16 '19 17:12

Any update on this? I'd love to read "A Taste of Scala 3" as a PDF on my reMarkable. Right now this chapter is split into many sections. I can "print" each section from Chrome, but that's a bit annoying; I'd rather have a nice clean PDF of the whole chapter or, even better, of the whole book.

aryx, May 02 '21 08:05

Note that this was less of a problem for the Scala 2 book, because "A Taste of Scala" was a single file (https://docs.scala-lang.org/overviews/scala-book/prelude-taste-of-scala.html), so it was easy to print from Chrome as a PDF.

aryx, May 02 '21 08:05

> Any update on this?

Technically I can generate PDF and MOBI versions pretty quickly. I think we were just waiting to have the book reviewed to get it in better shape. Then I just need to take the time to remember the process. :)

alvinj, Jul 01 '21 20:07