Add specification for info strings with inline code

Open tgross35 opened this issue 2 years ago • 10 comments

This adds RST-style colon-delimited info strings to inline code, which can be used for code highlighting

:python:`foo("abc", a=123)`

This takes inspiration from RST roles.

tgross35 • Oct 22 '23 18:10

This conflicts with emoji shortcodes such as those used here on GH, and likely conflicts with the generic directive proposal, which is supported in several popular projects such as markdown-it and the things I work on (micromark, remark, mdx, markdown-rs).

wooorm • Oct 22 '23 20:10

Good point, that is unfortunate. Do you have a better suggestion, or should this just be dropped?

Maybe something like (foo)`code` / {foo}`code` could work

tgross35 • Oct 22 '23 21:10

Hmm. Tough!

  • I think it’s important to add support for tagging inline code, and as with the other math discussion, to add support for evaluating said code.
  • The syntax has to look a bit like what it results in. This seems a bit program-y to me! (which RST is, but markdown tries not to be, and I think that’s one of the reasons why markdown is great!)
  • Would be very nice to break as little existing markdown as possible, perhaps even be somewhat back-compatible/graceful-looking when unsupported
  • Would be nice to look like existing fenced (block) code

Fenced code supports multiple words, and it puts the thing after the opening backticks. So perhaps we could do something with `js: code`. The colon is not strong enough as a separator though. I can’t think of stronger punctuation that isn’t already used a lot in programming code? Perhaps your colon-on-both-sides idea? `:js: code`.

So like, if you start the code with a special starting sequence (:?), CM will look until a special end sequence (:?), and interpret the keywords between it as the info string, of which the first is the programming language?
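A minimal sketch of that scan, under loudly stated assumptions: the opening and closing sequences are both a single `:`, keywords are separated by single spaces, and the closing `:` is followed by one space before the code. None of this is part of CommonMark, and the names `INFO_PREFIX`, `split_info`, and `language_of` are made up for illustration.

```python
import re

# Hypothetical rule (not in any spec): the content of a code span may start
# with ":", then one or more space-separated keywords, then ":" and a space.
INFO_PREFIX = re.compile(r"^:([^:`\s]+(?: [^:`\s]+)*): ")


def split_info(span_content):
    """Split the content of one inline code span into (info_words, code).

    If the content does not start with the assumed ":words: " prefix,
    the info list is empty and the content is returned unchanged.
    """
    match = INFO_PREFIX.match(span_content)
    if match is None:
        return [], span_content
    words = match.group(1).split(" ")
    return words, span_content[match.end():]


def language_of(span_content):
    """Return the first info word (the language), or None if there is none."""
    words, _ = split_info(span_content)
    return words[0] if words else None
```

With this rule, a span containing `:js: let x = 1` would yield the language `js` and the code `let x = 1`, while a span such as `a: b` is left untouched because it does not start with the opening `:`.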

Definitely not sure tho!

wooorm • Oct 22 '23 21:10

This certainly isn't appropriate for the core specification.

I think several implementations support postfixed attributes inside curly braces:

`foo("abc", a=123)`{python}

Crissov • Oct 23 '23 09:10

Pandoc has always supported a consistent attribute syntax for both code blocks and inline code.

``` {.python #ident key="value"}
code
```

And inline `foo = 3`{.python}

Note: the . indicates a class name.
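To make the shape of that attribute string concrete, here is a rough sketch that splits `{.python #ident key="value"}` into classes, an identifier, and key/value pairs. It covers only the forms shown in the example above (plus unquoted values, which is an assumption); it is not Pandoc's actual parser, and `TOKEN` and `parse_attrs` are hypothetical names.

```python
import re

# One attribute token: .class, #identifier, key="quoted value", or key=bare.
# This grammar is a simplification chosen for illustration.
TOKEN = re.compile(
    r'\.(?P<cls>[\w-]+)'
    r'|#(?P<id>[\w-]+)'
    r'|(?P<key>[\w-]+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>\S+))'
)


def parse_attrs(text):
    """Parse a brace-delimited attribute string into a small dict."""
    inner = text.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("expected an attribute string wrapped in { }")
    attrs = {"id": None, "classes": [], "pairs": {}}
    for match in TOKEN.finditer(inner[1:-1]):
        if match.group("cls") is not None:
            attrs["classes"].append(match.group("cls"))
        elif match.group("id") is not None:
            attrs["id"] = match.group("id")
        else:
            quoted = match.group("quoted")
            value = quoted if quoted is not None else match.group("bare")
            attrs["pairs"][match.group("key")] = value
    return attrs
```

Because the same string format follows both the opening fence and the closing backtick of inline code, one routine like this could serve both cases.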

This is what I would have preferred for commonmark, too, and I argued for it, but some of the others involved preferred to be less opinionated about the "info string."

jgm • Oct 23 '23 15:10

> This certainly isn't appropriate for the core specification.

Could you elaborate on this?

It came up at https://github.com/commonmark/commonmark-spec/pull/745, where the discussion was about adding some way to mark inline `...` code as math, rather than adding something like $ ... $

tgross35 • Oct 23 '23 17:10

> * The syntax has to look a bit like what it results in. This seems a bit program-y to me! (which RST is, but markdown tries not to be, and I think that’s one of the reasons why markdown is great!)

I wonder if a combination of @jgm's mention and the current link syntax would work to be less programmy. `print("hi")`(python) reads to me as "here is some code (by the way it's python)", kind of like [some link](www.link.com) reads to me as "here is a link (by the way you can get it at link.com)". But I'm not sure whether this would break existing syntax.

> Fenced code supports multiple words, and it puts the thing after the opening backticks. So perhaps we could do something with `js: code`. The colon is not strong enough as a separator though. I can’t think of stronger punctuation that isn’t already used a lot in programming code? Perhaps your colon-on-both-sides idea? `:js: code`.

> So like, if you start the code with a special starting sequence (:?), CM will look until a special end sequence (:?), and interpret the keywords between it as the info string, of which the first is the programming language?

I think I saw this discussed somewhere at some point; I'll try to dig up the thread. I don't have any idea what a strong enough separator would be, since almost every sigil is used in some language or other.

> Definitely not sure tho!

Tricky question!

tgross35 • Oct 23 '23 17:10

I actually like `print("hi")`(python) and think it is unlikely to break existing markdown.

We can make it even safer by accepting only a single word (relaxed: anything without a space or `\t`; or stricter: not even allowing punctuation). That should cover most of the specify-a-language use case, at the cost of not allowing everything that is permitted in the fenced code info string (multiple words, arbitrary characters).
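As a rough illustration of the single-word restriction, the sketch below matches a code span immediately followed by a parenthesized tag. It is an assumption-laden toy, not a proposal for spec text: it only handles single-backtick spans (real CommonMark code spans can use longer backtick runs), and the two word patterns and the function name `find_tagged_spans` are invented here.

```python
import re

# Hypothetical word rules for the tag inside the parentheses.
WORD_RELAXED = r"[^\s()`]+"   # anything without whitespace, parens, backticks
WORD_STRICT = r"[A-Za-z0-9]+"  # letters and digits only, no punctuation


def find_tagged_spans(text, strict=False):
    """Return (tag, code) pairs for every `code`(tag) occurrence in text.

    The "(" must directly follow the closing backtick, and the tag must be
    a single word under the chosen rule; otherwise the span is not tagged.
    """
    word = WORD_STRICT if strict else WORD_RELAXED
    pattern = re.compile(r"`([^`]+)`\((" + word + r")\)")
    return [(m.group(2), m.group(1)) for m in pattern.finditer(text)]
```

Under the relaxed rule a tag such as `c++` is accepted, while the strict rule rejects it; in both cases a multi-word parenthetical after a code span, like "(see the docs)", is left alone as ordinary text.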

wooorm • Oct 25 '23 10:10

This wouldn't be backwards compatible, but one syntax already in the wild is Typst's, which allows info strings only in multi-backtick inline code.

```python print("hi")```

https://typst.app/docs/reference/text/raw/

tgross35 • Nov 29 '23 21:11

Yeah, the Typst-based syntax seems like the most intuitive option for Markdown IMO. No change of existing syntax, really, except for an expansion of how it can be used.

EDIT: Sorry, missed that it wouldn't be backwards-compatible. That's unfortunate.

camelid • May 22 '25 23:05