Math in documentation

Open szhorvat opened this issue 5 years ago • 12 comments

My number one wish for the documentation system is math support. It would be extremely useful if we could write proper math in typical LaTeX notation, i.e. $ separators for inline math and either $$ or \[ ... \] separators for display math.

@ntamas Is this feasible?

It would seem to me that if the $ / \[ separators are literally included in the generated HTML, then it will be trivial to make them come alive on the online doc pages with MathJax. (This is how MathJax normally works.)

However, the online pages are not the only target for the documentation. In some situations one may want to generate standalone HTML or plain text. If the $...$ blocks are left unrendered in those versions, I think that's fine, and definitely not worse than what we currently have. We also have PDF output. The $...$ blocks would look very ugly there. I would not be very happy about that, but personally I would make that tradeoff to be able to get nicely rendered math in the online version. I expect that very few people would want to use the printable PDF version anyway.

szhorvat avatar Jul 23 '20 15:07 szhorvat

One (big?) difficulty here is that the math sections would have to be isolated, any commands (like \em) within them should not be interpreted, and things like > or < would have to be quoted. Doing this with the current regex-based solution might not be easy.
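
A minimal Python sketch of the isolation step described above, not a drop-in change for the existing regex-based converter. It assumes math is delimited by $...$, $$...$$ or \[ ... \], and the names `MATH_RE`, `split_math` and `render` are hypothetical. Math chunks are only XML-escaped and never reach the pass that interprets commands such as \em.

```python
import html
import re

# Assumed delimiters: display math ($$...$$ or \[...\]) is tried first,
# then inline $...$ restricted to a single line.
MATH_RE = re.compile(
    r"(\$\$.+?\$\$|\\\[.+?\\\]|\$[^$\n]+?\$)",
    re.DOTALL,
)


def split_math(text):
    """Split text into (is_math, chunk) pairs."""
    parts = MATH_RE.split(text)
    # With one capturing group, re.split puts the matches at odd indices.
    return [(i % 2 == 1, chunk) for i, chunk in enumerate(parts) if chunk]


def render(text, process_markup):
    """Apply process_markup to non-math text only; escape < > & in math."""
    out = []
    for is_math, chunk in split_math(text):
        if is_math:
            # Leave the LaTeX untouched apart from XML escaping.
            out.append(html.escape(chunk, quote=False))
        else:
            out.append(process_markup(chunk))
    return "".join(out)
```

Here `process_markup` stands in for whatever the existing pipeline does with ordinary doc-comment text. A real implementation would also need to handle escaped dollar signs, which this sketch ignores.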

szhorvat avatar Jul 24 '20 13:07 szhorvat

Since we use DocBook to generate the documentation (both HTML and PDF), the only solution I can conceive right now (completely untested, just a theory) is to:

  • add a new \math marker (and maybe an \inlinemath marker to distinguish between inline equations and the ones that are formatted in their own paragraph) that uses LaTeX notation
  • extend c-docbook.re to convert these markers into DocBook <equation> and <inlineequation> tags with the <alt> representation of the equation containing the LaTeX notation
  • update the XSL stylesheet that we use to convert DocBook to HTML to convert equations into the format understood by MathJax
  • use DBLaTeX to generate the PDF documentation
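
To make the second step concrete, here is a rough Python sketch of the marker-to-DocBook conversion. Everything about the marker syntax is an assumption: the \math ... \endmath and \inlinemath ... \endinlinemath forms are invented for illustration, and `to_docbook` is a hypothetical helper, not part of the current tooling. Whether the stylesheets accept exactly this element structure is equally untested.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical marker syntax; the real markers are still to be decided.
BLOCK_RE = re.compile(r"\\math\b(.+?)\\endmath\b", re.DOTALL)
INLINE_RE = re.compile(r"\\inlinemath\b(.+?)\\endinlinemath\b", re.DOTALL)


def _wrap(tag, tex):
    """Build a DocBook equation element carrying the LaTeX source in <alt>."""
    tex = escape(tex.strip())  # escapes &, < and > for XML
    return f"<{tag}><alt>{tex}</alt><mathphrase>{tex}</mathphrase></{tag}>"


def to_docbook(text):
    """Replace the assumed math markers with DocBook equation elements."""
    text = BLOCK_RE.sub(lambda m: _wrap("equation", m.group(1)), text)
    return INLINE_RE.sub(lambda m: _wrap("inlineequation", m.group(1)), text)
```

The LaTeX source is duplicated into `<mathphrase>` only so that output formats with no math support still show something readable. The XSL step would then pick up the `<alt>` content and emit it with MathJax delimiters.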

The other way would be to use MathML, which is meant to be the standard way of writing equations in DocBook, but it looks like MathML is so complex that most people prefer to just use LaTeX.

I don't think I'll have time for this in the near future but I leave this here as a first nudge in the (hopefully) right direction if anyone wants to start working on it.

ntamas avatar Jul 26 '20 09:07 ntamas

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 24 '20 16:09 stale[bot]

KaTeX might be of interest: https://katex.org/ (I know it through http://docs.ropensci.org/katex/)

maelle avatar Feb 19 '24 08:02 maelle

Okay, so basically this would mean that we could use KaTeX instead of MathJax in my proposal above to cover the HTML side and we could leave math as-is for the PDF version?

ntamas avatar Feb 19 '24 10:02 ntamas

how is the PDF generated? the katex R package covers that case by using latex macros (I don't know the details)

maelle avatar Feb 19 '24 11:02 maelle

Related: https://github.com/igraph/igraph.org/issues/41

maelle avatar Feb 19 '24 11:02 maelle

tl;dr KaTeX does not solve the problem here. I think the suggestion comes from a misunderstanding.


What KaTeX solves is display of math in a browser. This was never an issue. MathJax (mentioned above) could do this many years before KaTeX (and can arguably still do it better).

The issue is that we use the DocBook format, which can then be converted to a large variety of other output formats, an advantage I would rather not lose. We produce DocBook from the doc comments in the source code. DocBook represents mathematics using MathML, which is not suitable for writing by hand, and which we can't include in doc comments.

Note that LaTeX is not a properly structured representation of mathematical expressions, and can't be reliably converted to MathML ...

There are some partial solutions to using LaTeX notation in DocBook, but I couldn't find anything complete that we can safely rely on. http://users.wfu.edu/cottrell/dbtexmath/

szhorvat avatar Feb 19 '24 12:02 szhorvat

What could be done is to switch to a completely different method of generating the documentation, avoid DocBook, and just agree that we only support browser output, nothing else. That would be a major project that I don't think we have the capacity for at this moment ... it would likely involve having to convert the existing doc comment format as well.

szhorvat avatar Feb 19 '24 12:02 szhorvat

> it would likely involve having to convert the existing doc comment format as well.

An interesting project together with the automatic updating of docs for all the interfaces? :innocent:

maelle avatar Feb 19 '24 12:02 maelle

> What could be done is to switch to a completely different method of generating the documentation, avoid DocBook

Note that we have recently accepted a PR in the C core that uses DocBook to generate docs in Texinfo format. Switching to a different method would mean abandoning Texinfo and/or having to find a documentation generation method that also supports Texinfo.

ntamas avatar Feb 19 '24 12:02 ntamas

> automatic updating of docs for all the interfaces

I know that this is what motivated this discussion, but I think it's best if I'm very explicit here: Doing this is plainly not possible. Not if igraph is going to be anything more than just bindings to a C library. This becomes quite clear once you actually try to develop a realistic system to do it.

I am not against discussing it, or hearing proposals, but I am quite convinced that it is impossible.

szhorvat avatar Feb 19 '24 12:02 szhorvat