
Misleading documentation on rST hyperlink syntax.

Open gmilde opened this issue 6 months ago • 9 comments

Describe the bug

The reStructuredText Primer (doc/usage/restructuredtext/basics.rst) starts the section on External links with

Use ```Link text <https://domain.invalid/>`_`` for inline web links. 

which is, according to the reStructuredText Markup Specification, the syntax for a "named hyperlink reference with embedded URI", the most complex and most misunderstood link syntax variant.

Users thus primed will be surprised when a subsequent "inline link" with the same link text results in a WARNING Duplicate explicit target name: "link text" because the syntax is equivalent to the link–target pair ("reference link")

Use ```Link text`_`` for hyperlinks. Add a target block::

    .. _link text: https://domain.invalid/

at a suitable place.

and generates two Doctree elements:

<reference name="Link text" refuri="https://domain.invalid/">Link text</reference>
<target ids="link-text" names="link\ text" refuri="https://domain.invalid/"></target>

The rST analogue of Markdown "inline links" is an "anonymous hyperlink reference with embedded URI/alias", which uses a double trailing underscore.

There is actually no good reason to use a named link with embedded href (with single trailing underline):

  • If the established reference name is used at several places, a separate named target adds clarity.

  • One-off use cases are better served by the anonymous variant (with double trailing underscore) or a "normal" anonymous reference–target pair.
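
As a minimal sketch of the two one-off forms named above (the link texts and URLs are placeholders, not taken from the Sphinx docs):

```rst
An anonymous reference with embedded URI,
`Docutils <https://docutils.sourceforge.io/>`__,
does not register its link text as a reference name.

An anonymous reference, `the rST specification`__, paired with
an anonymous target:

__ https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html
```

Anonymous references and anonymous targets are matched by their order of appearance, so the target has to follow in the same sequence as the references.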

Suggestion:

  • Start the "primer" documentation section with "reference links". In rST, they are easier than "inline links".
  • For "inline links", write
    Use ```Link text <https://domain.invalid/>`__`` for inline web links 
    (**mind the double underline at the end**, for details see the `rST specification`__).
    
    __ https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#embedded-uris-and-aliases
    

How to Reproduce

index.rst:

The references `link text <https://domain.invalid/>`_
and `link text <https://example.com/>`_ generate a warning.
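
For comparison, a sketch of the same two references in the anonymous form suggested above. Assuming the behaviour described in the rST specification, this variant should build without the duplicate-target warning:

```rst
The references `link text <https://domain.invalid/>`__
and `link text <https://example.com/>`__ do not share a target name.
```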

Environment Information

The bug is in the current online documentation https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#external-links

gmilde avatar Jun 03 '25 20:06 gmilde

See https://github.com/sphinx-doc/sphinx/pull/13576#discussion_r2096624497 for an example of the confusion when the "rST primer" is used as primary information on reStructuredText.

gmilde avatar Jun 03 '25 20:06 gmilde

If I am allowed to make a half-serious pedagogical suggestion to improve this reST Primer in our Docs: same as literals need two back ticks not one, hyperlinks are better done with two underscores not one... (I would hesitate saying "inline" hyperlink, because I have no clear idea what "inline" is supposed to mean here, as if there was a "display block" type of hyperlink? Mind that most Sphinx users exclusively belong to the HTML rendering world and vocabulary.)

jfbu avatar Jun 04 '25 20:06 jfbu

I definitely belong to the category of the surprised users...

jfbu avatar Jun 04 '25 20:06 jfbu

I have also been bitten by this before. I think "inline" is a misnomer here and the intention was to denote hyperlinks with embedded URIs. I've opened a PR to improve the docs.

timhoffm avatar Jun 04 '25 21:06 timhoffm

> I have also been bitten by this before. I think "inline" is a misnomer here and the intention was to denote hyperlinks with embedded URIs.

It is a "commonmarkism". See inline links in the CommonMark spec.

gmilde avatar Jun 05 '25 05:06 gmilde

@AA-Turner: Thanks for the fix. Two issues remain:

In #13424, the expectation is for "https://en.wikipedia.org/wiki/Antenna_(radio)" to be converted to a link including the closing parenthesis. According to the rST spec, "Punctuation at the end of a URI is not considered part of the URI, unless the URI is terminated by a closing angle bracket (>)." Adding a link to the specification of "standalone hyperlinks" would allow users to check there for details of URI recognition.

  URLs and email addresses in text are automatically linked and do not need
- explicit markup at all.
+ explicit markup at all (:duref:`ref <embedded-uris-and-aliases>`).

The description of "the best approach" is repeated at the end of the section:

- You can also separate the link and the target definition (:duref:`ref
- <hyperlink-targets>`), like this::
- 
-     This is a paragraph that contains `a link`_.
- 
-     .. _a link: https://domain.invalid/
- 
  Internal links
  ~~~~~~~~~~~~~~~~~

gmilde avatar Jun 10 '25 21:06 gmilde

@gmilde would you consider creating a pull request?

AA-Turner avatar Jun 10 '25 22:06 AA-Turner

The changes are rather small, so it should be easy to do without a PR.

Currently I'm prioritizing the upcoming Docutils 0.22 release.

gmilde avatar Jun 11 '25 15:06 gmilde

Docutils commit r10176 changes the specification and implementation of named references with embedded URI or alias to create implicit targets instead of explicit targets.

Rationale: It seems that, up to now, a majority of users did not know that

`link text <href>`_

creates both a hyperlink reference and a <target> element referring to "href" with the reference name "link text". Hence, the syntax should not be interpreted as an explicit intention to create a target. Still, like section titles, it provides a match of a link text to a link target that allows using the link text in a simple hyperlink reference without the need for an explicit target, provided it is unique.

Implications: Using the same link text in another reference with embedded target no longer generates a warning (just like two section titles with the same text).

see `here <example.com>`_ and `here <example.org>`_

As before, an error is reported if a duplicate reference name is used

like here_.

A unique reference name can still be used in simple hyperlink references:

We love `Python <http://python.org/>`_ because Python_ is 
the best programming language.

An explicit target with the same reference name now silently overrides the implicit target.

The target named "here" is now _`here` and can be used like here_.

This means that with Docutils 0.22, it is OK to use named "inline references". The advantage over anonymous references is that, in case of a misspelling like

`link text<href>`_

the error message can point to the correct line, while with anonymous targets only a mismatch in the number of references and targets can be reported.
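
To illustrate, a sketch of the same misspelling (missing space before the opening angle bracket) in both forms. The comments paraphrase the kind of diagnostic to expect and are not verbatim Docutils output:

```rst
.. Named: parsed as a plain reference to the unknown name
   "link text<href>", so the problem can be reported for this line.

`link text<href>`_

.. Anonymous: parsed as an anonymous reference without a matching
   anonymous target, so only a count mismatch between references
   and targets can be reported.

`link text<href>`__
```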

In Sphinx, use of implicit targets in a :ref: role fails. This leads to a change of behaviour with Docutils 0.22:

.. _explicit internal:

A paragraph with an `external reference <http://example.com>`_ with embedded
URI and an `internal reference <explicit internal_>`_ with embedded alias 
(both named).

Using the refnames `external reference`_ and `internal reference`_ with
rST syntax works

Using ":ref:`external reference <external reference>`" with the 
Sphinx :ref: role fails.
                                            
Using ":ref:`internal reference <internal reference>`" with the 
Sphinx :ref: role works with Docutils < 0.22.rc5 but fails with HEAD.

A Sphinx :ref: to the :ref:`explicit internal <explicit internal>` 
reference works.

However, :ref: references to footnotes fail for both, implicit and
explicit targets:

.. [#] An auto-numbered footnote creates an implicit target.
.. [#myfoot] An auto-labelled footnote creates an explicit target.

rST links to footnote 1_ (implicit) and "myfoot_" (explicit).

Sphinx :ref: to footnote :ref:`1 <1>` (implicit) and ":ref:`myfoot
<myfoot>`" (explicit).

Is this a use case we should care for? Anything else that I missed?

gmilde avatar Jun 13 '25 14:06 gmilde