
Add math-block notation

Open heavywatal opened this issue 8 years ago • 16 comments

As discussed in #189, the following notations are added to math-block notation:

\[
  a = \sum \frac 1 i
\]

\begin{equation}
  a = \sum \frac 1 i
\end{equation}

\begin{equation*}
  a = \sum \frac 1 i
\end{equation*}

heavywatal avatar Sep 01 '17 05:09 heavywatal

Cool!

As I feared, the CI tests failed. Are you familiar with that? I'd like to have a couple new ones added for these new math notations as well.

burodepeper avatar Sep 01 '17 08:09 burodepeper

Now \\[ ... \\] and \begin{equation} ... \end{equation} are properly recognized. (I noticed that one more backslash is needed in Markdown so that it is translated into \[ ... \] in HTML.)

I think \[ in HTML blocks should be matched as well because wrapping equations in <div> is a common work-around to prevent underscores from being interpreted by Markdown:

<div>\[
a = \sum_{i=1} \frac 1 i
\]</div>

Where should I add the pattern for this? Maybe in the language-html package, or here?

heavywatal avatar Sep 03 '17 04:09 heavywatal

Thanks for your efforts so far! I'm a little busy at the moment, so I haven't had time to review your PR in depth yet. From what I could see, you haven't added any spec that tests for situations that it shouldn't match. Could you add a couple of those? Think about situations where a similar notation might be used, or the good notation with a typo, etc. This will help cover any regressions in the future.
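To make the request for negative cases concrete, here is a minimal sketch in Python rather than in the package's own spec format. The regular expression is a hypothetical stand-in for the grammar rule, not the pattern used in the PR; the point is only to show a should-match list next to a should-not-match list.

```python
import re

# Hypothetical stand-in for the grammar rule: display math delimited by a
# double-backslash bracket pair, as typed in the Markdown source.
DISPLAY_MATH = re.compile(r"\\\\\[(.+?)\\\\\]", re.DOTALL)

SHOULD_MATCH = [
    r"\\[ a = \sum \frac 1 i \\]",
    "\\\\[\n  a = \\sum \\frac 1 i\n\\\\]",   # same block spread over lines
]

SHOULD_NOT_MATCH = [
    r"\[ a = 1 \]",     # single backslash: an escaped bracket in Markdown
    r"\\[ a = 1 \]",    # closing delimiter has only one backslash
    r"\\[ a = 1",       # closing delimiter is missing
    r"\\[\\]",          # empty body
]

def is_display_math(text):
    """Return True if the text contains a double-backslash display-math block."""
    return DISPLAY_MATH.search(text) is not None
```

Each entry in the second list is a near miss of the accepted notation, which is the kind of case that tends to regress silently.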

The pattern you mentioned should go into language-html. Although I'm not sure why it is needed. Rather, it seems that it shouldn't be needed. Maybe the order in which the patterns are matched might also solve this issue. If it does, be sure to leave a comment describing so.

burodepeper avatar Sep 04 '17 20:09 burodepeper

Please take your time. I am happy with using my branch on Atom for the time being.

I guess the matching order is not the only problem here because the required number of backslashes is different: \\[ in Markdown, \[ in HTML. But I agree that this should be in language-html.
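The difference in backslash counts can be illustrated with a simplified model of Markdown backslash escapes. This sketch follows the CommonMark rule that a backslash before an ASCII punctuation character is dropped; Blackfriday's exact escape set may differ, and the function name is made up for illustration.

```python
import re

# A backslash followed by an ASCII punctuation character is an escape:
# the backslash is dropped and the character is kept literally.
BACKSLASH_ESCAPE = re.compile(r"\\([!-/:-@\[-`{-~])")

def resolve_backslash_escapes(markdown_text):
    """Simplified model of escape handling; ignores code spans and emphasis."""
    return BACKSLASH_ESCAPE.sub(r"\1", markdown_text)

print(resolve_backslash_escapes(r"\\[ a \\]"))         # two backslashes in, one out
print(resolve_backslash_escapes(r"\[ a \]"))           # backslashes are consumed
print(resolve_backslash_escapes(r"\begin{equation}"))  # letters are not escapable
```

Under this model, `\\[` survives as `\[` for MathJax to pick up, while a single `\[` loses its backslash. `\begin{equation}` passes through unchanged because `b` is not punctuation.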

heavywatal avatar Sep 06 '17 09:09 heavywatal

I think the code looks fine but it depends on https://github.com/area/language-latex/pull/185. Otherwise the latex highlighting will be off.

kylebarron avatar Jul 23 '18 13:07 kylebarron

I may be a bit confused. Given that we need to write \\[ in Markdown to generate \[ in HTML, I am happy with what I get from this PR, as shown in the following figure:

[Screenshot 2018-07-23 at 22:51:49]

Are you trying to make \[ work in Markdown with PR 185?

heavywatal avatar Jul 23 '18 14:07 heavywatal

Given that we need to write \\[ in Markdown to generate \[ in HTML

What flavor of Markdown are you using?

By default Pandoc supports only $...$ for inline math and $$...$$ for display math.

There are non-default Pandoc extensions for both \[ ... \] and \\[...\\].

Extension: tex_math_single_backslash

Causes anything between \( and \) to be interpreted as inline TeX math, and anything between \[ and \] to be interpreted as display TeX math. Note: a drawback of this extension is that it precludes escaping ( and [.

Extension: tex_math_double_backslash

Causes anything between \\( and \\) to be interpreted as inline TeX math, and anything between \\[ and \\] to be interpreted as display TeX math.

I'm surprised to see that there's actually no definition for math in the CommonMark spec.

So which of the following should we color as math environments?

  • $...$
  • $$...$$
  • \(...\)
  • \[...\]
  • \\(...\\)
  • \\[...\\]

kylebarron avatar Jul 23 '18 14:07 kylebarron

Oh, I didn't know that pandoc has such options. How does pandoc process equations? I am using Hugo/Blackfriday for HTML generation, and MathJax for equation rendering. Hence it is not surprising to me that CommonMark does not have math specs: rendering is MathJax's job. All Markdown has to do is generate proper HTML code to be interpreted by JavaScript. From that point of view, tex_math_single_backslash is inconsistent with CommonMark.

According to short-math-guide.pdf from AMS-Math (see below), \begin{equation} ... \end{equation} and \begin{equation*} ... \end{equation*} should also be interpreted as equations. Of course MathJax supports it. But I am not sure if it should start with single- or double-backslash in this case because either works well with Hugo.

In summary, the following patterns in Markdown should be colored IMO:

$...$
$$...$$
\\(...\\)
\\[...\\]
\begin{equation} ... \end{equation}
\begin{equation*} ... \end{equation*}

Using $$ ... $$ is not recommended in short-math-guide.pdf, and I don't use that. But some people might want it...?

[Screenshot 2018-07-24 at 00:17:46]

heavywatal avatar Jul 23 '18 15:07 heavywatal

How does pandoc process equations?

If converting from markdown to HTML, it converts to the appropriate HTML and uses either MathJax or KaTeX. If converting to PDF/LaTeX, it converts to the appropriate LaTeX commands and then optionally compiles the LaTeX to PDF.

I'm in agreement for

$...$
$$...$$
\\(...\\)
\\[...\\]

Supporting $$...$$ is a must because that's the standard Pandoc way to write display math.

I'm also in agreement for the begin-end clauses, but I'm curious why you suggest those. Does Blackfriday implement those at all? I support those because Pandoc supports Raw LaTeX input. However if we're going to support \begin{equation} ... \end{equation}, we need to also support other environments from AMSMath. For example, the align environment is used to vertically align several equations in a row, but an error is produced if you try to nest align within equation and the practice is not recommended.

For example, this raises an error with pandoc test.md -o test.pdf -s:

# testing

\begin{equation}
\begin{align}
a = \sum \frac 1 i
\end{align}
\end{equation}

So if we want to support \begin{equation} ... \end{equation}, as I believe we should, then we should support something like

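{
  # Group 1 captures the environment name plus an optional star.
  begin: '\\\\begin\\{((?:align|equation|multline|split|gather|alignat|aligned|gathered|eqnarray|array|tabular)\\*?)\\}'
  # TextMate-style grammars refer back to a begin capture with \1, not $1.
  end: '\\\\end\\{\\1\\}'
}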
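The idea of tying the end pattern to whatever the begin pattern captured can be tried outside Atom. The following is a rough Python sketch of that idea, not the grammar itself: it uses Python's `re` back-reference `\1`, the environment list is the one suggested above, and the function name is hypothetical.

```python
import re

# Environment names taken from the list suggested above.
ENVIRONMENTS = (
    "align|equation|multline|split|gather|alignat|aligned|gathered"
    "|eqnarray|array|tabular"
)

# Group 1 captures the name plus an optional star; \1 then requires the
# very same text inside the matching \end{...}.
MATH_ENV = re.compile(
    r"\\begin\{((?:" + ENVIRONMENTS + r")\*?)\}(.*?)\\end\{\1\}",
    re.DOTALL,
)

def find_math_environments(text):
    """Return (environment name, body) pairs for each non-overlapping match."""
    return [(m.group(1), m.group(2).strip()) for m in MATH_ENV.finditer(text)]
```

With this pairing, `\begin{equation} ... \end{equation*}` is not treated as a block because the names differ, and an `align` nested inside `equation` is reported as part of the outer `equation` body rather than as a second block.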

As an aside, my attitude about CommonMark and Math is this:

I strongly support inclusion of Pandoc-style inline math notation, with at least the status of “you don’t have to do anything with this, but you must still parse it correctly”. For instance, in

Let $y = m * x + b$ where $b$ is US$50,000

the asterisk in m*x should not be interpreted as an emphasis marker, and conversely, the dollar sign in US$50,000 should not be interpreted as a math shift.
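One way to get that behavior is the heuristic Pandoc documents for its dollar-delimited math: the opening `$` must be followed by a non-space character, and the closing `$` must be preceded by a non-space character and not followed by a digit. The regular expression below is only an approximation of that rule, and the function name is made up for illustration.

```python
import re

INLINE_MATH = re.compile(
    r"(?<![\\$])\$(?=\S)"     # opening $: not escaped, not part of $$, non-space follows
    r"((?:\\.|[^$\\])+?)"     # body: escaped characters, or anything but $ and backslash
    r"(?<=\S)\$(?!\d)"        # closing $: non-space before it, no digit after it
)

def find_inline_math(text):
    """Return the bodies of all inline-math spans found in text."""
    return [m.group(1) for m in INLINE_MATH.finditer(text)]

print(find_inline_math("Let $y = m * x + b$ where $b$ is US$50,000"))
# ['y = m * x + b', 'b']
```

The currency amount is left alone because no closing `$` follows it, and the asterisks stay inside the math span, where an emphasis rule should not look.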

kylebarron avatar Jul 23 '18 16:07 kylebarron

Supporting $$...$$ is a must because that's the standard Pandoc way to write display math.

Got it.

Does Blackfriday implement those at all?

No. Blackfriday knows nothing about math equations, just like CommonMark. It just generates HTML code like this:

<p>\begin{equation}
\sum \frac i a
\end{equation}</p>

which is later processed by MathJax, KaTeX, or any other JavaScript library. I totally agree with you and your quote: “you don’t have to do anything with this, but you must still parse it correctly”.

So if we want to support \begin{equation} ... \end{equation}, as I believe we should, then we should support something like

Agree. But I am not sure which environments should be included. For example, in the list you suggested, eqnarray is obsolete AFAIK. Maybe it can be a separate issue/PR. Shall we start small from just equation for now?

heavywatal avatar Jul 23 '18 23:07 heavywatal

I didn't take the time to see which of those were still used.

It's probably easier to take a few minutes and include a good list in this PR.

kylebarron avatar Jul 23 '18 23:07 kylebarron

OK, I have just added simple ones: align|equation|multline|split|gather. It is consistent with language-latex.

I excluded alignat|aligned|gathered because they seem to be complicated and require one more argument like \begin{alignat}{2}.

heavywatal avatar Jul 24 '18 01:07 heavywatal

Alternatively, similar to how we include all rules from HTML, we could include all rules from LaTeX after the Markdown rules. But I'm guessing that would have more side effects.

kylebarron avatar Jul 24 '18 02:07 kylebarron

I like the idea, and it seems to be a powerful and beautiful solution, but it is far beyond my abilities and resources.

heavywatal avatar Jul 24 '18 02:07 heavywatal

No, I mean that instead of writing them all ourselves, as long as the user has language-latex installed, we can just write

{include: 'text.tex.latex'}

And all of the latex highlighting rules get applied after the Markdown ones (depending on where that line is placed)

kylebarron avatar Jul 24 '18 02:07 kylebarron

Yes, I think I see your point. What I mean is that dealing with the side effects of including the whole language-latex grammar would be much harder than maintaining the math-block grammar ourselves. LaTeX is not just about math equations, but here we should limit the scope to math blocks.

heavywatal avatar Jul 24 '18 02:07 heavywatal