Table of Contents

Open Ciantic opened this issue 6 years ago • 17 comments

Is it possible to generate page numbers in the ToC and make the ToC rows clickable? A clickable references section is also required.

That was so hard that I myself resorted to LaTeX instead of Chrome a few months ago (I didn't know about ReLaXed back then, though).

Basically, to support those one needs to render in multiple passes, which is the reason LaTeX is so damn slow sometimes. It renders the page, gets the page numbers, makes changes by inserting the page numbers, re-renders the page, and checks that the insertion didn't change the page numbers, etc.
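
The multi-pass loop described above can be sketched as a small fixed-point iteration. This is only an illustration, not how LaTeX or ReLaXed does it: `layout` and `tocLength` are hypothetical stand-ins for a real render pass, and the section sizes and `entriesPerPage` value are made up.

```javascript
// Hypothetical stand-in for one render pass: given how many pages the ToC
// occupies, return the page on which each section starts.
// A real implementation would re-render the whole document here.
function layout(sections, tocPages) {
  let page = tocPages + 1; // the body starts right after the ToC
  return sections.map((section) => {
    const start = page;
    page += section.pages;
    return { title: section.title, page: start };
  });
}

// Hypothetical: number of pages a ToC with these entries needs.
function tocLength(entries, entriesPerPage) {
  return Math.max(1, Math.ceil(entries.length / entriesPerPage));
}

// Re-run the layout until inserting the ToC no longer changes its own length.
function resolvePageNumbers(sections, entriesPerPage = 30, maxPasses = 5) {
  let tocPages = 1; // initial guess
  let entries = layout(sections, tocPages);
  for (let pass = 1; pass <= maxPasses; pass++) {
    const needed = tocLength(entries, entriesPerPage);
    if (needed === tocPages) {
      return { entries, tocPages, passes: pass }; // numbers are stable
    }
    tocPages = needed; // the ToC grew or shrank: lay everything out again
    entries = layout(sections, tocPages);
  }
  throw new Error('page numbers did not stabilize');
}
```

The `maxPasses` guard matters because in a real layout the inserted numbers can themselves change line breaks, so convergence is not guaranteed.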

Ciantic avatar May 01 '18 08:05 Ciantic

Yes for "having clickable toc rows", but as for page numbers I'd say it is almost certainly impossible. Some PDF viewers will have a sidebar with sections and page numbers, but that won't be of any use if you intend to print your document.

Zulko avatar May 01 '18 08:05 Zulko

I have a crazy idea:

Have all the numbers, e.g. 1-300, overlaid on top of each other, and in a post-processing step remove the unneeded ones from the PDF document by looking at the underlying link to the section.

This trick works only if the page numbers can be absolutely positioned, as they usually are in a TOC.

I'm not expecting you to implement this, but I'll leave it here as I might investigate it myself. (I would desperately like to replace LaTeX, it's a mess.)

Ciantic avatar May 01 '18 08:05 Ciantic

So I had a bit of a play with this earlier, and my initial thinking was to use a Pug plugin and work on the AST that is generated. It seems, though, that the filters generate Text nodes once they are run, so if there are headings in a :markdown-it section, they don't become part of the AST. That is a shame, because it would have been a nice way to potentially do citations etc. as well.

My current thinking is that it would have to be done once the HTML is generated.

@Zulko Do you have any ideas on this currently? I guess an automatic TOC would need to take into account all ways of generating a <h?> tag, rather than just Pug?

benperiton avatar May 07 '18 15:05 benperiton

I believe the safest way would be some JavaScript exploring the DOM once the rest of the page has finished rendering (probably a jQuery plugin). Still, this would give you a list of links to sections, but not the page numbers.
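
A minimal sketch of that DOM-exploring approach, in plain JavaScript rather than jQuery. The helper names (`slugify`, `collectHeadings`, `renderToc`) and the `toc` class names are illustrative assumptions, not part of ReLaXed. It yields links to sections only, with no page numbers.

```javascript
// Hypothetical helper: turn heading text into a fragment-safe id.
function slugify(text) {
  return text.trim().toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Escape text before placing it inside HTML.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;')
    .replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}

// Collect h1..h<maxDepth> under `root` (e.g. `document` in the browser),
// giving every heading a unique id so it can be linked to.
function collectHeadings(root, maxDepth = 3) {
  const selector = Array.from({ length: maxDepth }, (_, i) => `h${i + 1}`).join(',');
  const used = new Set();
  return Array.from(root.querySelectorAll(selector)).map((el) => {
    const base = el.id || slugify(el.textContent) || 'section';
    let id = base;
    for (let n = 2; used.has(id); n++) id = `${base}-${n}`; // de-duplicate
    used.add(id);
    el.id = id;
    return { level: Number(el.tagName.slice(1)), text: el.textContent.trim(), id };
  });
}

// Render a flat list of links; nesting is left to CSS via the level class.
function renderToc(headings) {
  const items = headings.map((h) =>
    `<li class="toc-level-${h.level}"><a href="#${h.id}">${escapeHtml(h.text)}</a></li>`);
  return `<ul class="toc">\n${items.join('\n')}\n</ul>`;
}
```

In a page this would be run after everything else has rendered, for example `tocContainer.innerHTML = renderToc(collectHeadings(document))`, where `tocContainer` is whatever element should hold the list.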

Zulko avatar May 07 '18 15:05 Zulko

I created a working Table of Contents system: ToC. It is still a bit of a work in progress, but it works fairly well. There are some slight improvements to the workings of the bibliography as well, though no new features there. It adds a few mixins:

  • +TableOfContents(maxDepth=3) Create from the headings, defaults to (h1, h2, h3)
  • +ToC(maxDepth=3) Alias for above
  • +newPage Creates a new page by inserting a page break (has the .new-page class).

New pages are picked up together with the h# selection, so +newPage should be used instead of any custom page break: the class is what lets the generator see it. Before the PDF is rendered the page break does nothing, so the headings can still appear on the same page; the class forces a page increase for the table of contents.

I still need to test linking, but page numbers are working! I am not going to open a pull request yet, as there is still work to be done, and I also want to make some improvements to the bibliography.

Table of Contents is generated purely inside puppeteer, no additional packages needed!

Drew-S avatar May 26 '18 04:05 Drew-S

Note that there is already a TOC in ReLaXed documents, just not a printable one (some PDF viewers will show it in the sidebar). For an in-document TOC, there are many, many caveats related to page numbers: the TOC itself may shift the pages by one or two, the page numbers have to be evaluated in "print" layout mode, and they depend on the page height as well as on the left, right, top, and bottom margins, which are set at PDF writing time. And a broken TOC will bring us more bug fix requests than no TOC at all. So I am a bit torn on this issue right now.

Zulko avatar May 26 '18 10:05 Zulko

I am not seeing where the ToC is generated, I assume puppeteer generates it then?

So far my ToC is not complete, but it is printable: it generates the text where +ToC is called, then goes through and obtains the page numbers, so the ToC shifts everything down before the page numbers are finalized. It also currently takes the page format into account, but not the page margins, so its page numbering is not correct yet.

The +newPage also introduces a slight error that I still need to take into account.

Drew-S avatar May 26 '18 16:05 Drew-S

I made some more improvements to the ToC, but I still need to do plenty more testing to make sure it is working fully. The ToC generation takes into account: the ToC shifting content; the top, right, left, and bottom margins; and the height and width of the pages.

It does this by forcing the body width, while the ToC is generated, to the print width (8.5in) minus the left and right margins. It then increments the page number every time an element to mark is found past an integer multiple of the page height minus the top and bottom margins.

ToC:

  • src/generators.js 80-120 (capturing of @page margins)
  • src/generators.js 124 (body has fixed width)
  • src/generators.js 144 (page height for page number calculating)
  • src/generators.js 153, 156, 160 (+newPage error correction)
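
The page-number rule described above can be sketched as follows. This is one reading of the description, not the code in src/generators.js: the function name, the item shape, the 96 px per inch conversion, and the way forced breaks are handled are all assumptions made for illustration.

```javascript
const PX_PER_INCH = 96; // CSS reference pixels per inch (assumed unit for offsets)

// `items` are measured in one continuous, unpaginated layout:
//   { title, topPx }          a heading to list in the ToC
//   { newPage: true, topPx }  a forced page break (hypothetical shape)
function assignPages(items, { pageHeightIn, marginTopIn, marginBottomIn }) {
  // Height of the printable area of one page, in pixels.
  const usablePx = (pageHeightIn - marginTopIn - marginBottomIn) * PX_PER_INCH;
  if (usablePx <= 0) throw new RangeError('margins leave no usable page height');

  let shiftPx = 0; // extra offset accumulated from forced page breaks
  const result = [];
  const ordered = [...items].sort((a, b) => a.topPx - b.topPx);
  for (const item of ordered) {
    const top = item.topPx + shiftPx;
    if (item.newPage) {
      // Push everything after the break down to the next page boundary.
      shiftPx += Math.ceil(top / usablePx) * usablePx - top;
      continue;
    }
    // Each full multiple of the usable height crossed adds one page.
    result.push({ title: item.title, page: Math.floor(top / usablePx) + 1 });
  }
  return result;
}
```

For a letter page with 1in top and bottom margins the usable height is 864px, so a heading at 900px lands on page 2. This ignores what Chromium does at print time (keeping elements together, avoiding orphaned lines), so the real page can still differ.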

Drew-S avatar May 27 '18 16:05 Drew-S

Hi @Drew-S, is there an update on the TOC, and when will it be ported to a plugin we can use? It would be very useful to have. THX

upstroke avatar Jun 19 '18 15:06 upstroke

I have kind of called for a break on this feature as it doesn't play well with Chromium (there are cases that make it difficult to predict which page an element will end up on). See the end of this thread for the latest discussion:

https://github.com/RelaxedJS/ReLaXed/issues/88

I believe there is a possible solution that processes the PDF itself, but I haven't gotten around to testing it yet.

Zulko avatar Jun 19 '18 16:06 Zulko

Unfortunately, I have been busy for the last week and a half, so I have not done any work on this. As @Zulko has stated, it is on pause, as there are formatting issues that cannot be worked around. I spent some time looking through the puppeteer issues. #1778 is a request for ToC features in Puppeteer itself.

I have opened an issue on puppeteer, #2773, asking for a hook into page.pdf() that would let us manipulate the DOM after it has been formatted for print. If we get something like that, we could do most everything we need: ToC, figure list, custom footers, etc.

Unfortunately it is out of our hands right now. We can still make improvements in other areas at least.

Drew-S avatar Jun 19 '18 17:06 Drew-S

The problem is this: the Puppeteer team is not trying to solve these problems themselves.

This means that when you open an issue there, they can't understand what hurdles you have hit trying to solve it. Clearly @Drew-S gave this a good try and found a deficiency in Puppeteer, but they will dismiss these API changes for as long as they themselves don't try to do the same.

E.g. the issue they asked people to file with Chromium (840455) is a joke. Will they make a one-size-fits-all TOC? It makes no sense; no one wants a fully baked TOC in the API. We need an extensible API, such as one for modifying the DOM.

Ciantic avatar Jun 20 '18 07:06 Ciantic

@Drew-S is there a way to use the mixins you built in an HTML doc? I'm experimenting with your fork on branch ToCplugin & TableOfContents and I've created a config.yml in my document root folder that contains:

plugins:
  - tableofcontents

Any help would be greatly appreciated :)

#109

Gee19 avatar Jul 27 '18 19:07 Gee19

Is it possible to get this plugin without requiring the page numbers?

rfizzle avatar Aug 09 '19 06:08 rfizzle

Can you explain further what you need?

DanielRuf avatar Aug 09 '19 13:08 DanielRuf

I was able to fork the plugin Drew-S created. I just needed a digital ToC that is auto-generated inline and has links to the headers for my PDFs. Even without page numbers, it's better than nothing.

rfizzle avatar Aug 21 '19 02:08 rfizzle

Looks cool. Do you want to open a PR to add this?

DanielRuf avatar Aug 21 '19 03:08 DanielRuf