Configure external link recognition (and handling)

Open chrisjsewell opened this issue 5 years ago • 12 comments

Currently, if a link ([text](link)) does not match a URL scheme (e.g. 'http://...') then it is treated as an internal cross-reference (to a reference target or sphinx document, etc): https://github.com/executablebooks/MyST-Parser/blob/3d5ae4f94c9d39435d76861b86dc5171ee23c9df/myst_parser/docutils_renderer.py#L412-L415

In some use cases a configuration option could be useful, so that links with certain extensions (or matching certain regexes) are converted to external links rather than resolved as internal links.
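To make the proposal concrete, here is a minimal sketch of the decision being discussed. It is not MyST-Parser's actual code: `is_external`, `URL_SCHEMES`, and `EXTERNAL_PATTERNS` are hypothetical names, and the patterns are only examples.

```python
import re
from urllib.parse import urlparse

# Hypothetical defaults for illustration; not MyST-Parser's real configuration.
URL_SCHEMES = ("http", "https", "mailto", "ftp")
EXTERNAL_PATTERNS = (re.compile(r"\.pdf$"), re.compile(r"\.zip$"))


def is_external(destination, schemes=URL_SCHEMES, patterns=EXTERNAL_PATTERNS):
    """Return True if the link destination should stay an external link."""
    # Behaviour described above: a recognised URL scheme means external.
    if urlparse(destination).scheme in schemes:
        return True
    # Proposed addition: user-supplied regexes can also force external handling.
    return any(pattern.search(destination) for pattern in patterns)
```

With no patterns configured, a destination such as `downloads/the-paper.pdf` falls through to internal resolution; with the example patterns it would be kept as an external link.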

Originally posted by @chrisjsewell in https://github.com/executablebooks/jupyter-book/issues/823#issuecomment-668135898

chrisjsewell avatar Aug 03 '20 18:08 chrisjsewell

Same as #215?

jpmckinney avatar Apr 29 '21 13:04 jpmckinney

yep I guess so, and more or less #361

chrisjsewell avatar Apr 29 '21 13:04 chrisjsewell

Although #215 goes further than leaving the link external: it turns it into a download link. The focus here, though, is more configurable handling of the value of the (link), something like mapping an extension or regex to whether it is handled as an external link, a certain role, or some other logic.
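One way to picture such a mapping is an ordered rule table where the first matching pattern decides the handling. This is only a sketch under assumptions: `LINK_RULES`, `classify_link`, and the handling labels are hypothetical and do not correspond to an existing MyST-Parser option.

```python
import re

# Hypothetical rule table: the first matching pattern decides the handling.
LINK_RULES = [
    (re.compile(r"^https?://"), "external"),
    (re.compile(r"\.(pdf|zip)$"), "download"),
]


def classify_link(destination, rules=LINK_RULES, default="internal"):
    """Return the handling label for a link destination."""
    for pattern, handling in rules:
        if pattern.search(destination):
            return handling
    # Nothing matched: fall back to resolving as an internal cross-reference.
    return default
```

Because the rules are ordered, `https://example.com/a.pdf` is classified as "external" rather than "download".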

chrisjsewell avatar Apr 29 '21 13:04 chrisjsewell

recommonmark provided a url_resolver feature, where one could provide a custom handler to pre-process whatever is parsed as the url.

def url_resolver(url):
    ...
    return url

This made it very easy to make custom urls be parsed into whatever end result. Including stuff like [mylink](github:issues) being parsed into an external link to the github issue page, or [foobar](api:/foo/bar) becoming the correct full link to the api documentation page.
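As a rough illustration of the kind of handler meant here, the sketch below expands the two custom prefixes mentioned above. The `PREFIXES` table and its base URLs are placeholders made up for the example, not real project URLs.

```python
# Hypothetical prefix table; the base URLs are placeholders for illustration.
PREFIXES = {
    "github:": "https://github.com/example-org/example-repo/",
    "api:": "https://docs.example.org/api",
}


def url_resolver(url):
    """Expand custom prefixes; return any other URL unchanged."""
    for prefix, base in PREFIXES.items():
        if url.startswith(prefix):
            # Drop the prefix and append the remainder to the base URL.
            return base + url[len(prefix):]
    return url
```

Under these assumptions, `github:issues` would resolve to the repository's issues page and `api:/foo/bar` to a page under the API documentation root.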

Griatch avatar Oct 01 '21 20:10 Griatch

> recommonmark provided a url_resolver feature, where one could provide a custom handler to pre-process whatever is parsed as the url.

Errr, it may be possible to do that here. But I feel you may be stepping away from what a Markdown link actually is, and there are other ways to achieve this. For example, you could use a substitution for the link to GitHub issues, like:

myst_substitutions = {
  "gh_issues": "[GitHub Issues](https://github.com/executablebooks/MyST-Parser/issues)"
}

Or you could create a new sphinx role for special links: {api}`foo/bar`

chrisjsewell avatar Oct 01 '21 20:10 chrisjsewell

@chrisjsewell While one can achieve the outcomes like you suggest, it requires quite a lot of changes to (in my case) vast amounts of already written Markdown documentation using recommonmark. Even though I suppose I could write scripts to try to convert, it's not really an ideal solution:

  • Neither {{ gh_issues }} nor {api}`foo/bar` are familiar Markdown for people. And while I as the maintainer can learn MyST syntax for advanced things, I have to explain this to a lot of relatively inexperienced contributors that should ideally be able to contribute something as simple as a link with familiar syntax. Lowering the threshold for contributing is an important thing.
  • Being able to write [Issues](github:issues) or [FooBar class](api:foo.bar#FooBar) (where "github:issues" and other urls are just strings I get to parse and modify as I like at build-time) is just a lot easier to work with IMO.

The url_resolver is a very powerful tool and seems to be a much more flexible solution for overriding defaults than trying to provide some way to 'recognize external links' as suggested in this issue. IMO of course. :)

Griatch avatar Oct 01 '21 21:10 Griatch

I personally just use Sphinx's extlinks extension, which I assume is essentially what @chrisjsewell is referring to as new Sphinx roles.

Sure, you might prefer the syntax to be [hyperlink text](issue:123) instead of {issue}`hyperlink text<123>`, but the latter works better for non-hyperlink roles (there are a lot in Sphinx), and it requires no extra code.
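For reference, a minimal conf.py sketch of that setup might look like the following. The repository URL is a placeholder, and the role name `issue` is just the one used in the example above.

```python
# conf.py (sketch): enable MyST and Sphinx's built-in extlinks extension.
extensions = ["myst_parser", "sphinx.ext.extlinks"]

# role name -> (URL template, caption template).
# The repository URL below is a placeholder, not a real project.
extlinks = {
    "issue": ("https://github.com/example-org/example-repo/issues/%s", "issue %s"),
}
```

With this in place, {issue}`123` should link to issue 123 with the caption "issue 123", and {issue}`hyperlink text<123>` should use the explicit link text instead.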

jpmckinney avatar Oct 02 '21 00:10 jpmckinney

@jpmckinney I don't expect [hyperlink text](issue:123) to carry any particular meaning at all out of the box. With url_resolver it would be up to me to parse the string "issue:123" and figure out that it means a URL pointing to a particular GitHub issue. What styling/shortcut to use would be up to me.

For Myst to be a replacement for recommonmark, this functionality is pretty critical though.

Griatch avatar Oct 02 '21 17:10 Griatch

> For Myst to be a replacement for recommonmark, this functionality is pretty critical though.

Well myst has already replaced recommonmark 😉: https://github.com/readthedocs/recommonmark/issues/221

It's not out of the question, but I feel this is more of a "power-user" feature that could easily be replicated with an external transform, e.g. simply add to your conf.py:

# available in the next release
myst_all_links_external = True
# or
myst_url_schemes = ("http", "https", "mailto", "ftp", "issue")

from docutils import nodes
from sphinx.transforms import SphinxTransform

class MyTransform(SphinxTransform):
    default_priority = 1
    def apply(self, **kwargs) -> None:
        for node in self.document.traverse(nodes.reference):
            if "refuri" in node and node["refuri"].startswith("issue:"):
                node["refuri"] = "whatever you want"
                node["classes"].append("my-class")

def setup(app):
    app.add_transform(MyTransform)

chrisjsewell avatar Dec 28 '21 07:12 chrisjsewell

> What styling/shortcut to use would be up to me.

You'll see in my example above that, as well as being more in line with how Sphinx operates, you can even set CSS classes on the node and really do whatever you want with it.

chrisjsewell avatar Dec 28 '21 07:12 chrisjsewell

Is it accurate to say this issue is why [this paper](downloads/the-paper.pdf) produces WARNING: 'myst' reference target not found: downloads/the-paper.pdf and elides the link in output?

jedbrown avatar Jan 10 '22 17:01 jedbrown

Yep, and that will change in myst-parser 0.17

chrisjsewell avatar Jan 10 '22 17:01 chrisjsewell