MyST-Parser
Better link syntax for cross-references
Describe the problem/need and solution
I've been using MyST for a bit and it's quite nice being able to use Markdown instead of RST. However, a major pain point is using the cross-referencing syntax. The {ref}`target` and {ref}`name <target>` feel like I am just using a slightly modified version of RST. They aren't very Markdownic, if that is a word. For me, they fail the basic smell tests of good Markdown syntax:
- Easy to remember: The link syntax is basically the same as the RST syntax, except with brackets instead of colons. The RST syntax is notoriously hard to remember.
- Composable: It's impossible to format a link as code (or if it is possible, only with some trick that I haven't yet figured out), because it reuses backticks.
- Joy to use: Every time I have to make a cross-reference I feel the same slight pain I feel whenever I use RST. It's even worse if I haven't used it in a while and have to go look up the syntax (and I think I've mentioned this a dozen times before but I'll mention it again: please use the word "link" in the docs to refer to link syntax. No one knows what a "role" is).
Basically, it feels like RST syntax that has been shoved into Markdown rather than the way Markdown would actually implement such a thing.
MyST does let you write [name](target) and even [](target), and these are both great. But this only works when the target can be accessed with any. This unfortunately virtually never works for my use-case (cross-referencing functions with autodoc in the SymPy documentation).
I would propose extending the usual Markdown link syntax so that you can add the target type before the target name somehow. My suggestion would be to use a colon, like [name](func:target), but if that can't work for whatever reason there could be other options. Another suggestion, which is less syntactically nice but would at least make sense logically, would be to allow {ref}`target` inside of a Markdown link, like [name]({ref}`target`) (IMO this should be done regardless of whether any other new syntax is added).
I would also suggest implementing this in a way so that the ~. style works so that something like [name](func:~.target) works (see https://github.com/executablebooks/MyST-Parser/issues/468), i.e., get the target for :func:`target` and then convert that into a link rather than just rewriting it to :func:`name <target>`.
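To make the proposal concrete, here is a minimal Python sketch of how a link destination such as func:~.target might be split into a role name and a target, with the displayed title chosen separately from the target. Everything in it is an assumption for illustration: the helper name `split_typed_target`, the `KNOWN_ROLES` set, and the handling of `~` and a leading `.` (which only imitates the Python-domain convention). None of it is an existing MyST or Sphinx API.

```python
from typing import NamedTuple, Optional

# Hypothetical set of prefixes treated as role names; anything else
# (e.g. "https") is left alone so ordinary URLs keep working.
KNOWN_ROLES = {"func", "class", "meth", "mod", "ref", "doc"}


class TypedLink(NamedTuple):
    role: Optional[str]  # e.g. "func", or None for a plain link
    target: str          # what the role would be asked to resolve
    title: str           # text shown to the reader


def split_typed_target(destination: str, name: str = "") -> TypedLink:
    """Split ``role:target`` and choose a title without rewriting the target."""
    role, sep, rest = destination.partition(":")
    if not sep or role not in KNOWN_ROLES:
        # Not a typed reference: treat it as an ordinary link destination.
        return TypedLink(None, destination, name or destination)

    target = rest
    shorten = target.startswith("~")
    if shorten:
        target = target[1:]

    # Default title: drop a leading "." and, with "~", keep the last component.
    default_title = target.lstrip(".")
    if shorten:
        default_title = default_title.rpartition(".")[2]

    # An explicit name always wins; the target is passed on untouched.
    return TypedLink(role, target, name or default_title)
```

With this sketch, `split_typed_target("func:~.target", "name")` keeps `.target` as the thing to resolve and uses `name` only as the link text, which is the two-step behaviour asked for above.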
Guide for implementation
No response
Tasks and updates
No response
Ugh, look I'm not ruling anything out, but... I really think you need to try coding some of this, to understand how feasible some of it is.
Conversely, I would suggest for you to write some cross-reference heavy text to understand how difficult the current syntax is to use.
and this is because that is exactly what it is; it's taking the role name and content, and just passing it on to docutils/sphinx to handle
The syntax itself feels just copied from RST. Obviously the semantics need to be there, but it could have been anything that had those three parts. Why, for instance, does MyST use backticks for this? RST uses backticks because in RST backticks are what are used to denote cross-references. In Markdown, backticks mean code. And the <> part for the target is just copied straight from the RST. It's completely different from the usual Markdown way of making a link.
Reusing the triple backticks at least makes some sense because that's the only kind of "block" syntax in Markdown (although even that is somewhat annoying because my editor thinks every directive is a code block).
I disagree that {rolename}`content` is a difficult syntax to remember; I feel you are conflating other aspects of RST syntax.
How can you "disagree" that something is hard to remember? I'm telling you that I've had a hard time remembering it (and every time I've looked it up I've had a hard time even finding it because the docs use terminology that I don't expect). You not believing me is not very productive.
It seems like every time I interact with you I have to deal with this same sort of thing, and frankly, it's getting tiring.
Colons are a core aspect of standard URL syntax, so this would not be a good idea.
The colons were just a suggestion to try to prompt discussion. I guessed that they probably wouldn't work for some reason. The point was to give an idea of the sort of thing that might be simpler.
In fact I don't feel that we should be trying to implement any kind of bespoke syntax/regexes inside (); that's not very "Markdownic"
Isn't allowing cross-reference inside of the parentheses already sort of a special case?
Anyway, I can tell you that the first thing I ever tried when I wanted to make a custom link was the [name]({ref}`target`) syntax I suggested in the other issue. If you want to aim for "principle of least surprise" that's a good place to start.
What is Markdownic are attributes, as proposed by John MacFarlane, the main author of the CommonMark spec: johnmacfarlane.net/beyond-markdown.html#attributes (whom I was talking with recently: https://github.com/commonmark/commonmark-spec/issues/702#issuecomment-1094163179)
I haven't seen these before. Wouldn't a header attribute like the one described in the first link be preferable to the (target)= syntax currently used by MyST? (That's obviously unrelated to this discussion, but if that syntax could be replaced with something that Markdown parsers actually understood, that would be awesome.)
This is a good example; the ~. style is purely a Python domain specific thing: sphinx-doc/sphinx@b4276ed/sphinx/domains/python.py#L75, it does not work generically for all references
You're misunderstanding the point of this. I don't think MyST should know about ~.. All I'm suggesting is to make the link separately from the parsing of the reference. Basically parse in two steps. I don't know if that's feasible.
Alternately Sphinx or docutils itself could be fixed in this regard. I don't really care how it gets fixed, but this is an annoying pain point.
Hey all - thanks both for sharing your thoughts and perspectives here. I agree with both of you about the pain-points (both on the syntax side, and on the implementation side). I think it's important that we hear each other out about our perspectives and focus on constructive conversation and debate.
In my opinion, it is useful to have design-level conversations (e.g. what would be the best user experience?) separately from implementation-level conversations (e.g., what is realistic given our limited development and maintenance resources). I think we should then consider both of them in coming up with a proposed path forward, but the pros/cons of one do not invalidate the other, they should simply be considered together. So, thanks @asmeurer for being open about your pain points here from a user's perspective, and to @chrisjsewell for providing an implementation-level dose of realism here :-) .
Context on why this is hard to implement in Sphinx
Quick context from what @chrisjsewell was saying. I believe the challenge with the <> syntax is that Sphinx itself (and extensions in Sphinx) hard-code that syntax as a part of their content parsers. E.g. the Sphinx CrossReference class uses a regex to search for it here:
https://github.com/sphinx-doc/sphinx/blob/31eba1a76dd485dc633cae48227b46879eda5df4/sphinx/util/docutils.py#L462-L466
So in Sphinx it's not really a part of the "parsing stage", the <> is just "part of the content block for a role" and is a convention that many extensions use because reStructuredText sort-of implicitly defines this syntax as a part of external hyperlinks.
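As a rough illustration of that convention, the sketch below splits role content of the form `name <target>` with a regular expression. The pattern is an approximation written for this example, not a verbatim copy of the one in the linked Sphinx source, and `split_title_and_target` is a hypothetical helper name.

```python
import re
from typing import Tuple

# Approximation of the convention: "title <target>", with the angle
# brackets closing at the very end of the role content.
EXPLICIT_TITLE = re.compile(r"^(.+?)\s*<(.*?)>$", re.DOTALL)


def split_title_and_target(content: str) -> Tuple[bool, str, str]:
    """Return (has_explicit_title, title, target) for a role's content."""
    match = EXPLICIT_TITLE.match(content)
    if match:
        return True, match.group(1), match.group(2)
    # No "<...>" suffix: the content is both the title and the target.
    return False, content, content
```

The point is that the split happens on the role's content string after parsing, so a Markdown parser never sees the angle brackets as syntax of its own.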
MyST spec repo for discussion?
This also feels relevant to a few other conversations we've had over the months about how to extend the "role syntax" to include things like options:
- https://github.com/executablebooks/mystjs/issues/7
- https://github.com/executablebooks/MyST-Parser/issues/69
And I think more generally, something like this would be a good topic for discussion / conversation in the myst-spec repository, where we are trying to define the specification more formally: https://github.com/executablebooks/myst-spec
Just to be clear here, do you want me to open an issue for this on the myst-spec repo? You can also move this issue to that repo if you feel it would be better to live there.
Leaving to @choldgraf the decision whether to move this issue over to the spec repo (probably a good idea IMO), I'll comment here... Thx @asmeurer for this writeup! I have just begun to use these aspects more, and your perspective here is very valuable.
I think it's worth really looking at the user experience aside from the sphinx/docutils-imposed constraints: ultimately that is an internal implementation layer that could in the long run change. The reason so many of us moved from ReST to md was precisely user experience, and that was the entire reason why we had the original impetus for MyST way back when. We should continue looking for that fluid, joyful experience while writing and sharing content.
For what it's worth, I've always felt that the most intuitive Markdown link syntax extension for MyST would work as follows:
- Inline links work just the same as in standard Markdown. So anything of the sort [title](target) (i.e., with parentheses) works as it would otherwise. The target would usually be an explicit link, like https://example.com, but may also be a relative link inside the project. There could be some magic there, along the lines of what GitHub does, like mapping the target index.md to index.html. But nothing that surprises users, like the implicit download role tends to do (in my opinion).
- Reference links, on the other hand, should do the actual magic. These use brackets for the target: [title][target], where target is looked up elsewhere.
The look-up for reference targets would essentially do what Python does when it looks up the name of an identifier:
- Check the local scope. Here, the current Markdown file. If [target]: is defined elsewhere in the same document, use that. I.e., standard Markdown behavior.
- If not that, go one level up: See what Sphinx would find for target with the any role.
- If not that, go to the global scope: Look up target with Intersphinx.
The any look-up might be ambiguous, so maybe allow to specify Sphinx roles such as func: or download: (or :func:, :download:) as a prefix for target.
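The scope chain described above can be sketched in a few lines of Python. This is only a model of the idea: the three dictionaries are stand-in data, and `resolve` is a hypothetical helper, not something MyST or Sphinx provides. A real implementation would query the document's link reference definitions, Sphinx's any role, and the Intersphinx inventories instead.

```python
from typing import Callable, Iterable, Optional

# A scope is any callable that maps a target name to a URL, or None.
Resolver = Callable[[str], Optional[str]]


def resolve(target: str, scopes: Iterable[Resolver]) -> Optional[str]:
    """Return the first URL any scope yields for target, innermost first."""
    for lookup in scopes:
        url = lookup(target)
        if url is not None:
            return url
    return None


# Stand-in data for the three scopes (assumed for this example only).
local_definitions = {"intro": "#introduction"}
project_objects = {"mypkg.solve": "api.html#mypkg.solve"}
intersphinx_inventory = {
    "print": "https://docs.python.org/3/library/functions.html#print",
}

# Order matters: local file, then the project, then external inventories.
scopes = [local_definitions.get, project_objects.get, intersphinx_inventory.get]
```

Calling `resolve("intro", scopes)` stops at the local scope, while a name found in none of the three returns `None`, which is where a warning about a broken reference would go.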
I know this would break a lot of stuff that's already working somewhat differently in MyST. But I feel(!?) this should be easy on the implementation side, once refactored, and line up better with what users would expect a "multi-document Markdown renderer" to do.
Just as a note, I have opened up an issue here which is aiming to track various places where there are suggestions/problems/improvements around cross references (including this issue). Another potential avenue that has been discussed is to adopt pandoc citation/reference syntax (starting with an @).
- See: https://github.com/executablebooks/MyST-Parser/pull/613, thoughts welcome
These use brackets for the target:
[title][target], where target is looked up elsewhere.
Just to note @john-hen this is not particularly easy, because any standard CommonMark parser will only recognise these as links, if it can match it to a target, i.e. this would break CommonMark compliance. (I've already thought about this before, and asked on the commonmark spec 😅 https://github.com/commonmark/commonmark-spec/issues/702)
@chrisjsewell Yeah, okay. But it seems that, between you and John MacFarlane, you both agree that that particular example from the CommonMark spec isn't all that useful. And I would add that it's certainly esoteric. I don't think a lot of people are writing things like [foo][bar][baz] in their Markdown documents and have high expectations as to how that would be parsed.
@john-hen I'm afraid that if you don't agree with the CommonMark specification, then you need to petition for that to change; it is not the business of MyST to go against it. MyST uses a CommonMark-compliant parser, so it would not be trivial to implement something that went against the spec anyhow.