almanac.httparchive.org

Investigate using EPUB for ebooks

Open j9t opened this issue 4 years ago • 6 comments

The 2019 ebook refers to and depends on links, but none seem to be present at least in the version on Google Play Books.

Screenshots attached for both mobile and desktop. I haven’t checked the whole book but so far, nothing seems accessible. (This doesn’t seem intended. If it was, please consider links at least for author information and for anything indicated as a link, as with expressions starting with “http”.)

Screen Shot 2020-10-28 at 17 35 41

Screenshot_20201028-162047

Screenshot_20201028-162237


Also some centering issues as noted in #1391

j9t avatar Oct 28 '20 16:10 j9t

@tunetheweb seems like the URLs in the footnotes were all removed by Books. Any ideas to work around that?

rviscomi avatar Oct 28 '20 16:10 rviscomi

Oh looks like they are gone in the PDF version too!: https://almanac.httparchive.org/static/pdfs/web_almanac_2019_en.pdf

The links still work (at least on PDFs) but the footnotes showing the URL are gone.

Will take a look.

tunetheweb avatar Oct 28 '20 16:10 tunetheweb

I tell a lie - they are still there online. Phew.

@rviscomi, I don't show footnotes when the full URL is already shown, as that seems a bit redundant.

So my profile for example at the bottom of the HTTP/2 chapter has this:

Barry Pollard profile

But only includes the hidden link to my book, at the bottom of the page:

Footnote 46

There is no footnote URL link to my social media icons, nor the Twitter account and website in the text. This was for presentational reasons as otherwise we ended up with loads and loads of footnotes that looked really untidy. To me the URL is obvious from those links so felt better to hide.
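For illustration only, the rule described above could be sketched roughly like this in Python. This is not the Almanac's actual implementation: the `Link` class, the `obvious` opt-out flag, and the example.com sample links are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Link:
    text: str              # visible link text in the chapter
    href: str              # link target
    obvious: bool = False  # hypothetical opt-out flag (social icons, handles, etc.)


def normalize(url: str) -> str:
    """Drop scheme, leading 'www.' and trailing slash so that
    'https://www.example.com/' and 'example.com' compare equal."""
    url = url.strip().lower()
    for prefix in ("https://", "http://", "www."):
        if url.startswith(prefix):
            url = url[len(prefix):]
    return url.rstrip("/")


def needs_footnote(link: Link) -> bool:
    """A printed footnote only helps when the reader cannot already see the URL."""
    if link.obvious:
        return False
    return normalize(link.text) != normalize(link.href)


def footnotes(links):
    """Return the hrefs that would be printed as footnotes, in document order."""
    return [link.href for link in links if needs_footnote(link)]


# Hypothetical sample data, loosely modelled on an author profile.
SAMPLE_LINKS = [
    Link("My book", "https://example.com/book"),
    Link("https://www.example.com", "https://www.example.com/"),
    Link("@example", "https://example.com/social", obvious=True),
]
```

With the sample data, only the "My book" link would get a footnote; the link whose text is already the URL and the one flagged as obvious would not.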

It appears Google Books does include the footnotes, so that's good. @j9t I presume that is what you are showing in your 3rd screenshot with Una's links showing?

However it looks like it removes the clickable links themselves 😞 Both from the original link and the footnotes. The PDF version has clickable links both in the text and in the footnotes, which is much nicer.

I suspect this is to do with the auto conversion of PDF to EPUB. I did look at converting our PDF to EPUB (using Calibre) but didn't get nice results. The table of contents for example is below:

Calibre EPUB conversion

So on one hand I'm impressed that Google Books did such a good job on converting so it looks nice. But unfortunately it doesn't retain links. I don't think we can solve this until we find a decent PDF -> EPUB conversion tool.
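For anyone who wants to repeat the Calibre experiment, a rough Python sketch is below. It assumes Calibre's `ebook-convert` command-line tool is installed and on the PATH. The helper names are made up for illustration and no tuning options are passed, so it says nothing about the quality of the output.

```python
import shutil
import subprocess
from pathlib import Path


def convert_command(pdf_path: str) -> list:
    """Build the ebook-convert invocation.

    ebook-convert takes an input file and an output file; the output
    format is inferred from the output file's extension.
    """
    epub_path = str(Path(pdf_path).with_suffix(".epub"))
    return ["ebook-convert", pdf_path, epub_path]


def convert(pdf_path: str) -> None:
    """Run the conversion, failing loudly if Calibre is not available."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert (Calibre) not found on PATH")
    subprocess.run(convert_command(pdf_path), check=True)
```

Calling `convert("web_almanac_2019_en.pdf")` would write `web_almanac_2019_en.epub` next to the PDF, provided the tool is installed.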

tunetheweb avatar Oct 28 '20 17:10 tunetheweb

On the chance that I can contribute somehow:

  1. For PDF to EPUB conversion there must be many tools—could some other tool do the job?

  2. What’s the source material: Markdown, HTML? I can’t tell how much work it would be to switch, but Leanpub is one example of where that conversion works really well, generating decently formatted books from HTML or Markdown into PDF, EPUB, and MOBI. I swear by it, and the output is definitely compatible with Google Books.

Just on the chance this can be useful—I can tell that much already went into this.

j9t avatar Oct 28 '20 17:10 j9t

Just found this note: https://support.google.com/books/partner/answer/107073?hl=en&ref_topic=3238502

"Hyperlinks: If your PDF contains hyperlinks, either to other parts of the same book or to external websites, please note that the links will be disabled when your book is processed."

Looks like it is supported for EPUB though: https://support.google.com/books/partner/answer/3316879?hl=en&ref_topic=3238502

The source is HTML: https://almanac.httparchive.org/en/2019/ebook and CSS Paged Media. We then use PrinceXML to convert to PDF.

I'm open to ideas on better EPUB converters. Calibre seemed to be a recommended free one last time I looked but, as I say, the results weren't great.
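Since the source is already HTML, one option might be to package it as EPUB directly rather than going through PDF. Below is a minimal sketch using only the Python standard library. It shows the container layout (a ZIP whose first entry is an uncompressed `mimetype` file, plus `META-INF/container.xml`, a package document, a nav document and XHTML content) and that ordinary `<a href>` links live in the content as-is. The title, identifier, date and file names are placeholders. It has not been run through EPUBCheck, and a real book would also need CSS, images and a full table of contents.

```python
import io
import zipfile

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Placeholder metadata; a real book needs its own identifier, title and date.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Placeholder title</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2020-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol>
    </nav>
  </body>
</html>
"""

CHAPTER_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body>
{body}
  </body>
</html>
"""


def build_epub(chapter_body: str) -> bytes:
    """Package one XHTML chapter as an EPUB and return the archive bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as epub:
        # The mimetype entry must come first and must be stored uncompressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        files = {
            "META-INF/container.xml": CONTAINER_XML,
            "OEBPS/content.opf": CONTENT_OPF,
            "OEBPS/nav.xhtml": NAV_XHTML,
            "OEBPS/chapter1.xhtml": CHAPTER_TEMPLATE.format(body=chapter_body),
        }
        for name, content in files.items():
            epub.writestr(name, content, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

For example, `build_epub('<p><a href="https://example.com/">Example</a></p>')` returns bytes that can be written to a `.epub` file. The chapter body has to be well-formed XHTML, so the existing HTML would likely need some cleanup first.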

tunetheweb avatar Oct 28 '20 17:10 tunetheweb

Renaming the issue to focus on the EPUB format which should solve the centering issue in #1391 and the clickable links issue reported here.

rviscomi avatar Dec 01 '20 03:12 rviscomi