gitea icon indicating copy to clipboard operation
gitea copied to clipboard

Add KaTeX rendering to Markdown.

Open zeripath opened this issue 3 years ago • 9 comments
trafficstars

This PR adds mathematical rendering with KaTeX.

The first step is to add a Goldmark extension that detects the latex (and tex) mathematics delimiters.

The second step to make this extension only run if math support is enabled.

The second step is to then add KaTeX CSS and JS to the head which will load after the dom is rendered.

Fix #3445

Signed-off-by: Andrew Thornton [email protected]

zeripath avatar Jul 31 '22 16:07 zeripath

OK TODO:

  1. Check how much this increases the size of Gitea. If it's considerable then we may need to make this a build option
  • This appears to be 1Mb so I don't really think it matters.
  1. I think the observer should really replace the math element so as to avoid having to re-render things.
  • Done.

zeripath avatar Jul 31 '22 16:07 zeripath

OK it appears to be only 1mb difference with the bindata builds so that's good.

zeripath avatar Jul 31 '22 16:07 zeripath

The second step is to then add KaTeX CSS and JS to the head which will load after the dom is rendered.

We should use code-splitting based on existance of .markup code.language-math, like it's done for Mermaid.

silverwind avatar Aug 01 '22 15:08 silverwind

Is it possible to support LaTeX blocks? E.g.

\begin{alignat*}{2}
k(t) &= {\frac {x_{1}'(t)\,x_{2}''(t)-x_{1}''(t)\,x_{2}'(t)}{\left(x_{1}'\left(t\right)^{2}+x_{2}'\left(t\right)^{2}\right)^{{3}/{2}}}},\qquad &&N(t) ={\frac {1}{\|\gamma '(t)\|}}{\begin{bmatrix}-x_{2}'(t)\\x_{1}'(t)\end{bmatrix}} \\
R(t) &= \left|{\frac {\left(x_{1}'\left(t\right)^{2}+x_{2}'\left(t\right)^{2}\right)^{{3}/{2}}}{x_{1}'(t)\,x_{2}''(t)-x_{1}''(t)\,x_{2}'(t)}}\right|\qquad {\text{and}}\qquad &&Q(t)=\gamma (t)+{\frac {1}{k(t)\|\gamma '(t)\|}}{\begin{bmatrix}-x_{2}'(t)\\x_{1}'(t)\end{bmatrix}}\,.
\end{alignat*}

I'm not sure how markdown support for this is, but assuming this also will apply to Org files, LaTeX blocks are part of the Org syntax spec.

tecosaur avatar Aug 01 '22 16:08 tecosaur

As for syntax, I'm in favor of either only supporting the fenced syntax, or both, but the fenced syntax should be preferred as it plays nices with other markdown syntax mixed in. References:

silverwind avatar Aug 01 '22 16:08 silverwind

As for syntax, I'm in favor of either only supporting the fenced syntax, or both, but the fenced syntax should be preferred as it plays nices with other markdown syntax mixed in. References:

Mathematicians write LaTeX all the time. It is completely unnatural for them to write this using only fenced blocks. For a start it prevents inline equations.

Relegating maths to using the fenced only system - which you'll note I did not even add to this PR initially because it so unnatural - is frankly not going to be a solution for them.


I'm happy to add in a fence block support and I assume that would mean that you would want KaTeX support to be permanently available like mermaid if so. (Certainly some of your comments appear to imply that - at present if you set MATH_ENABLED=false you will not load or run any KaTeX code.)

But I would strongly argue that we to leave in the option for support of LaTeX style primitives within the markdown parser. Fenced code blocks are just not enough.

zeripath avatar Aug 01 '22 21:08 zeripath

I'm happy to add in a fence block support and I assume that would mean that you would want KaTeX support to be permanently available like mermaid if so. (Certainly some of your comments appear to imply that - at present if you set MATH_ENABLED=false you will not load or run any KaTeX code.)

I recall GitHub's math rendering specifially has some issues/conflicts with markdown syntax, that's why I thought fenced syntax was more compatible because it's a clean-cut separation. Some issues I've read about from GH's implementation:

  • GitHub first applies the Markdown parser, then the math parser
  • The markdown parser messes with the math (because it thinks $ and $$ is just text). For example it removes backslashes
  • If there's Markdown/HTML syntax in math (e.g., a <b > c), it won't render
  • If there's math in Markdown syntax (e.g., in lists and headers), it won't render

Yes, we should have it as a baseline feature, but happy to have toggles in the config to disable it, we could also add MERMAID_DISABLED, that just exists the JS before the code-splitting takes action.

silverwind avatar Aug 01 '22 22:08 silverwind

I recall GitHub's math rendering specifially has some issues/conflicts with markdown syntax, that's why I thought fenced syntax was more compatible because it's a clean-cut separation. Some issues I've read about from GH's implementation:

The markdown extensions are optional and can be turned off. Their priority can be changed to higher or lower depending on whether they're being affected by other markdown but I think the current priority is probably ok. If you look at #20640 I've made it possible to even decide whether to have the extension partially on or not depending on the frontmatter in the document. This would allow mathematicians to simply add following three lines:

---
math:all
---

Then they would be able to use $ inline math and it only affect that document.

The parameter can be extensively tuned - you'd need to look at the tests as I haven't written the documentation.

  • GitHub first applies the Markdown parser, then the math parser

This is plugged in directly into our markdown parser and interrupts the markdown flow.

  • The markdown parser messes with the math (because it thinks $ and $$ is just text). For example it removes backslashes

This doesn't happen in this PR. This is a true markdown extension.

  • If there's Markdown/HTML syntax in math (e.g., a <b > c), it won't render

Have you tried this on this PR? It works correctly. It is a true markdown extension.

  • If there's math in Markdown syntax (e.g., in lists and headers), it won't render

Again this works fine it's a true markdown extension

where \(\tilde{\lambda}\in (\Z^{2n}_+)^{\mathrm{Hom}(F, \overline{\mathbb{Q}}_p)}\) is the highest weight 
of the representation \(\iota^{-1}(\xi\otimes \xi)^\vee\) of \((\mathrm{Res}{F/\mathbb{Q}}\mathrm{GL}{2n}){\overline{\mathbb{Q}}_p}\).[^1]

* If \(F_0\subset F\) is an imaginary quadratic field and \(\ell\) is a prime
which splits in \(F_0\) (including possibly \(\ell=p\)), then for each prime \(v\mid \ell\) of \(F\) lying above 
a prime \(\bar{v}\) of \(F^+\), there is an isomorphism 

\[
\mathrm{WD}\left(r_{\iota}(\pi)|{G{F_v}}\right)^{F-ss} \simeq \mathrm{rec}^T_{F_v}
(\pi_{\bar{v}}\circ \iota_v). 
\]

\[ a < b > c \]

\[x=5\]

[^1]: For each \(\overline{\tau}:F^+\to \mathbb{C}\), \(\xi\) gives a representation of \(\mathrm{tG}{\overline{\tau}}\) and hence for \(\tau,\tau c\) extending \(\overline{\tau}\) to \(F\) we have a represententation \(\xi\otimes\xi\) of \(\mathrm{tG}_{\overline{\tau}}\times \mathrm{tG}_{\overline{\tau}} = (\mathrm{GL}_{2n,F}){\tau} \times (\mathrm{GL}{2n,F}){\tau c}\). 
Note that \(r_{\iota}^{\vee,c}(1-2n)\cong r_{\iota}(\pi)\) so \(\mathrm{HT}_{\tau}(r_{\iota}(\pi))\) and \(\mathrm{HT}_{\tau c}(r_{\iota}(\pi))\) can be read off from each other.

Becomes:

Screenshot from 2022-08-02 22-38-57

Yes, we should have it as a baseline feature, but happy to have toggles in the config to disable it, we could also add MERMAID_DISABLED, that just exists the JS before the code-splitting takes action.

I've changed the code so the toggle is just there to enable/disable the markdown extension ```math will still be rendered with KaTeX if it's not enabled.

#20640 provides a more significant change that would allow per document adjustment to adjust the math extension to run or not.

zeripath avatar Aug 02 '22 21:08 zeripath

Codecov Report

Merging #20571 (a7d60af) into main (6361b48) will decrease coverage by 0.03%. The diff coverage is 24.68%.

@@            Coverage Diff             @@
##             main   #20571      +/-   ##
==========================================
- Coverage   47.12%   47.09%   -0.04%     
==========================================
  Files        1008     1016       +8     
  Lines      137718   138044     +326     
==========================================
+ Hits        64899    65007     +108     
- Misses      64884    65098     +214     
- Partials     7935     7939       +4     
Impacted Files Coverage Δ
models/activities/repo_activity.go 61.50% <ø> (ø)
models/repo/repo.go 61.73% <ø> (ø)
modules/markup/markdown/convertyaml.go 0.00% <0.00%> (ø)
modules/markup/markdown/math/block_node.go 0.00% <0.00%> (ø)
modules/markup/markdown/math/inline_node.go 0.00% <0.00%> (ø)
modules/markup/markdown/renderconfig.go 0.00% <0.00%> (ø)
modules/packages/vagrant/metadata.go 79.62% <ø> (ø)
modules/setting/setting.go 49.58% <ø> (ø)
routers/api/packages/pypi/pypi.go 69.91% <ø> (ø)
routers/web/repo/view.go 43.25% <ø> (ø)
... and 23 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov-commenter avatar Aug 02 '22 21:08 codecov-commenter

Works quite well, but something needs to be done about loading spinner at least for the inline math, probably just make it smaller.

image

silverwind avatar Sep 01 '22 21:09 silverwind

This should fix it:

diff --git a/web_src/less/animations.less b/web_src/less/animations.less
index 92a3052a1..ea31d53bf 100644
--- a/web_src/less/animations.less
+++ b/web_src/less/animations.less
@@ -32,8 +32,15 @@
 .editor-loading.is-loading {
   height: var(--height-loading);
 }

+code.language-math.is-loading::after {
+  padding: 0;
+  border-width: 2px;
+  width: 1.25rem;
+  height: 1.25rem;
+}
+
 @keyframes fadein {
   0% {
     opacity: 0;
   }
image

silverwind avatar Sep 01 '22 21:09 silverwind

Oh and one last tweak: As per our browser support, we don't need the ttf and woff assets, so we can reduce the amount of font assets added to a third:

diff --git a/webpack.config.js b/webpack.config.js
index 974af15c0..7590256cd 100644
--- a/webpack.config.js
+++ b/webpack.config.js
@@ -33,8 +33,12 @@ const filterCssImport = (url, ...args) => {
     if (/brand-icons/.test(importedFile)) return false;
     if (/(eot|ttf|otf|woff|svg)$/.test(importedFile)) return false;
   }

+  if (cssFile.includes('katex') && /(ttf|woff)$/.test(importedFile)) {
+    return false;
+  }
+
   if (cssFile.includes('font-awesome') && /(eot|ttf|otf|woff|svg)$/.test(importedFile)) {
     return false;
   }

silverwind avatar Sep 01 '22 22:09 silverwind

Done @silverwind

zeripath avatar Sep 02 '22 22:09 zeripath

Okay, besides above, I think the final thing that's missing is updating docs (feature table, remove katex reference in doc/advanced/external-renderers.en-us.md).

Oh, and there's various linter errors.

silverwind avatar Sep 02 '22 23:09 silverwind

Okay, besides above, I think the final thing that's missing is updating docs (feature table, remove katex reference in doc/advanced/external-renderers.en-us.md).

That's for doing server-side rendering of katex - which is a different thing.

zeripath avatar Sep 03 '22 07:09 zeripath

oh ffs this has been conflicted under my feet - I think from the recent issuetemplates PR.

Yes it was from #20987. I'm not certain I understand why the Valid() function has been removed from that struct.

zeripath avatar Sep 03 '22 07:09 zeripath

Noticed on potential issue. Take this source:

When $a \ne 0$, there are two solutions to $(ax^2 + bx + c = 0)$ and they are 
$$ x = {-b \pm \sqrt{b^2-4ac} \over 2a} $$

According to https://github.blog/2022-05-19-math-support-in-markdown/ it should render like this:

image

On gitea it just omits the second line completely:

image

I think Katex fails to render it, but does not throw an error either, just outputs mostly empty HTML:

image

Note I can not get it to work on GitHub currently either, but at least they do show the unrendered source:

image

silverwind avatar Sep 03 '22 11:09 silverwind

Is this as ```math section or as plain markdown?

zeripath avatar Sep 03 '22 11:09 zeripath

Hmm... I see that Github appear to suggest that they've just plainly enabled inline dollar math?

Really?


When $a \ne 0$, there are two solutions to $(ax^2 + bx + c = 0)$ and they are

$$ x = {-b \pm \sqrt{b^2-4ac} \over 2a} $$


OK I stand corrected we can simply enable that too.

zeripath avatar Sep 03 '22 11:09 zeripath

~~OK there's a genuine bug - I've been able to reproduce - gonna look at it.~~


~~So the block renderer doesn't appear to be coping with the rest of the line which seems a bit strange to me...~~

~~yup I've written it to expect $$ on a single line by itself. I'll need to do some more work on this.~~


DONE and FIXED

zeripath avatar Sep 03 '22 11:09 zeripath

Is this as ```math section or as plain markdown?

Plain markdown. By the way I'm quite shocked at how broken GitHub's renderer is. The first comment is a fenced math block, it does not render at all for some reason.

silverwind avatar Sep 03 '22 11:09 silverwind

Is this as ```math section or as plain markdown?

Plain markdown. By the way I'm quite shocked at how broken GitHub's renderer is. The first comment is a fenced math block, it does not render at all for some reason.

Screenshot from 2022-09-04 14-48-47

These work well on the implementation in this PR.

zeripath avatar Sep 04 '22 13:09 zeripath

Unrelated failure during dryrun, I think https://github.com/go-gitea/gitea/pull/21078 will fix it.

silverwind avatar Sep 06 '22 09:09 silverwind

Problem seems to be that Docker build executes go-licenses before installing node_modules because there is no make dependency on node_modules, but in any case, we should just land https://github.com/go-gitea/gitea/pull/21078 to fix it.

silverwind avatar Sep 06 '22 10:09 silverwind

make lgtm work

zeripath avatar Sep 13 '22 13:09 zeripath