Sub- and superscript detection

Open saf-dmitry opened this issue 6 years ago • 9 comments

I find the current sub- and superscript detection too restrictive. What is the reason for limiting sub- and superscript expressions to just alphanumeric characters?

saf-dmitry avatar Jul 11 '18 13:07 saf-dmitry

I don't think there's a reason; that's just how the original patch was written. Here are Pandoc's rules: http://pandoc.org/MANUAL.html#superscripts-and-subscripts.

jrblevin avatar Jul 13 '18 14:07 jrblevin

> Here are Pandoc's rules: http://pandoc.org/MANUAL.html#superscripts-and-subscripts.

Well, in this case sub- and superscript can still contain escaped spaces, punctuation, and other symbols.

Therefore I suggest the following regexp. It allows any backslash-escaped symbol, incl. whitespace, or any other non-whitespace symbol to be part of a sub- or superscript expression.

(defconst markdown-regex-sub-superscript
  "\\(?:^\\|[^\\~^]\\)\\(\\([~^]\\)\\(\\(?:\\\\.\\|[^[:space:]]\\)+?\\)\\(\\2\\)\\)"
  "The regular expression matching a sub- or superscript.
The leading un-numbered group matches the character before the
opening tilde or caret, if any, ensuring that it is not a
backslash escape, caret, or tilde.
Group 1 matches the entire expression, including markup.
Group 2 matches the opening markup--a tilde or caret.
Group 3 matches the text inside the delimiters.
Group 4 matches the closing markup--a tilde or caret.")
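
As a quick check, the regexp can be tried on a few strings with `string-match`. This is only a sketch: the names `my/sub-superscript-regex` and `my/sub-superscript-inner` are made up for illustration (they are not part of markdown-mode), and since `[:space:]` follows the syntax table of the current buffer, it assumes that space has whitespace syntax there.

```elisp
;; Hypothetical name, so that markdown-mode's own constant is left untouched.
(defconst my/sub-superscript-regex
  "\\(?:^\\|[^\\~^]\\)\\(\\([~^]\\)\\(\\(?:\\\\.\\|[^[:space:]]\\)+?\\)\\(\\2\\)\\)"
  "Copy of the proposed sub-/superscript regexp, for experimentation only.")

(defun my/sub-superscript-inner (str)
  "Return the text inside the first sub- or superscript in STR, or nil."
  (when (string-match my/sub-superscript-regex str)
    ;; Group 3 is the text between the delimiters.
    (match-string 3 str)))

(my/sub-superscript-inner "H~2~O")     ; => "2"
(my/sub-superscript-inner "x^a\\ b^")  ; => "a\\ b"  (escaped space is accepted)
(my/sub-superscript-inner "x^a b^")    ; => nil      (unescaped space is rejected)
```

Evaluating the forms in `*scratch*` (for example with `C-x C-e` after each one) shows the results in the echo area.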

saf-dmitry avatar Jul 16 '18 06:07 saf-dmitry

If there is a way to implement this functionality it would be of broad use to many doing scientific writing.

In-buffer rendering of a common scientific use of superscripts, such as 1.0 × 10^−15^, is currently broken, at least for me (markdown-mode version 20220116.209).

1.0 × 10^15^ renders as expected.

Thanks :)

steveb-123 avatar Jan 17 '22 11:01 steveb-123

@steveb-123 How about the latest version? It highlights as shown in the screenshot below.

[Screenshot: superscript highlighting]

syohex avatar Jan 17 '22 14:01 syohex

yup, that fixed it thanks!

I was surprised how far back the ELPA version was actually.

I got a little confused as I didn't notice you had merged that just now! It took me a minute to work out how to switch to the GitHub branch in Doom Emacs. For any fellow Doomers, adding this to packages.el got me onto the GitHub source of the package with no further config:

(package! markdown-mode :recipe (:host github :repo "jrblevin/markdown-mode"))

steveb-123 avatar Jan 17 '22 16:01 steveb-123

@syohex That's nice! However, what about Unicode minus sign U+2212? I would suggest allowing it in sub- and superscripts too.

saf-dmitry avatar Jan 17 '22 21:01 saf-dmitry

Well said, I didn't notice that minus was not in there.

I think both plus and minus would keep many chemists and biochemists happy also.

steveb-123 avatar Jan 18 '22 14:01 steveb-123

> I think both plus and minus would keep many chemists and biochemists happy also

Therefore I suggest this regexp: https://github.com/jrblevin/markdown-mode/issues/346#issuecomment-405159648. It allows any backslash-escaped symbol, incl. whitespace, or any other non-whitespace symbol to be part of a sub- or superscript expression.

saf-dmitry avatar Jan 18 '22 14:01 saf-dmitry

> I think both plus and minus would keep many chemists and biochemists happy also.

Speaking of chemistry: Unfortunately, since plus and minus are allowed as leading signs only, we still cannot use superscripts for denoting ions, like Fe^2+^.
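
For comparison, here is a minimal sketch of how the regexp suggested earlier in this thread behaves on such cases. The helper `my/superscript-text` is a hypothetical name used only for this illustration, and the snippet says nothing about what the regexp currently shipped in markdown-mode does.

```elisp
(defun my/superscript-text (str)
  "Return the text inside the first sub- or superscript in STR, or nil.
Uses a copy of the regexp proposed earlier in this thread."
  (let ((re "\\(?:^\\|[^\\~^]\\)\\(\\([~^]\\)\\(\\(?:\\\\.\\|[^[:space:]]\\)+?\\)\\(\\2\\)\\)"))
    (when (string-match re str)
      ;; Group 3 is the text between the delimiters.
      (match-string 3 str))))

;; A trailing sign and a leading U+2212 minus are both accepted,
;; because any non-whitespace character may appear inside the delimiters.
(mapcar #'my/superscript-text '("Fe^2+^" "1.0 × 10^−15^"))
;; => ("2+" "−15")
```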

saf-dmitry avatar Jan 19 '22 17:01 saf-dmitry