MyST-Parser icon indicating copy to clipboard operation
MyST-Parser copied to clipboard

Unable to link with multiple query params

Open JonZeolla opened this issue 1 year ago • 9 comments

Describe the bug

context I am trying to make a link containing query params such as https://example.com?first=a&second=b

expectation I expected that the rendered link would contain all of the query params.

bug But instead the & becomes encoded as & and breaks the link. I thought I'd be able to escape the & or that it would work natively, but neither seem to be true.

Reproduce the bug

[example][EXAMPLE]

[EXAMPLE]: https://example.com?first=a&second=b "Example"

and

[example][EXAMPLE]

[EXAMPLE]: https://example.com?first=a\&second=b "Example"

Renders:

<a class="reference external" href="https://example.com?first=a&amp;second=b" title="Example">example</a>

List your environment

Using MyST-Parser: 1.0.0

JonZeolla avatar Apr 18 '23 00:04 JonZeolla

Thanks for opening your first issue here! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out EBP's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
If your issue is a feature request, others may react to it, to raise its prominence (see Feature Voting).
Welcome to the EBP community! :tada:

welcome[bot] avatar Apr 18 '23 00:04 welcome[bot]

Reported also here: https://bugzilla.mozilla.org/show_bug.cgi?id=1828390

sylvestre avatar May 22 '23 07:05 sylvestre

Heya, yep this is consistent with the CommonMark spec:

  • https://spec.commonmark.org/0.30/#example-502
  • https://spec.commonmark.org/dingus/?text=%5Bexample%5D%5BEXAMPLE%5D%0A%0A%5BEXAMPLE%5D%3A%20https%3A%2F%2Fexample.com%3Ffirst%3Da%26second%3Db%20%22Example%22%0A

and also I believe with the HTML spec: https://stackoverflow.com/a/3705601/5033292

are you sure it is actually breaking the link, it doesn't appear to break for me in Firefox?

chrisjsewell avatar Jun 06 '23 07:06 chrisjsewell

It will load the page but the query param isn't properly sent to the server, so links like this turn into this and don't process the start time.

JonZeolla avatar Jun 06 '23 16:06 JonZeolla

What you said isn't quite correct; it turns it into <a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ&amp;t=2m12s">this</a> (not with the backslash escapes you put in your 2nd link above), which renders as this and any browser I try this link on correctly processes the start time (and gives me a musical surprise lol)

chrisjsewell avatar Jun 07 '23 00:06 chrisjsewell

@chrisjsewell yeah, the backslash escapes were for GitHub rendering to make a link that matches when you click it (if you look at the resulting href on GitHub it is accurate)

Although testing on my phone just now it works as I would want it to, but that’s with the YouTube app handling it, not a browser

JonZeolla avatar Jun 07 '23 10:06 JonZeolla

I am having the exact same issue. For https://example.com?first=a&second=b, the api service I am using complains that the second parameter is amp;second, which it expects to be second. Is there any way to let myst not encode the url?

Update: I ended up manually using sed to find and remove the amp;.

kan-fu avatar Nov 25 '23 00:11 kan-fu

Same here. Markdown-it-py 3.0.0 generates the correct url with & encoded as &amp;

$ markdown-it <(echo "[example](https://example.com?first=a&second=b)")
<p><a href="https://example.com?first=a&amp;second=b">example</a></p>

whereas myst-parser 2.0.0 yields a broken &amp;amp;

$ myst-docutils-demo <(echo "[example](https://example.com?first=a&second=b)")
<p><a class="reference external" href="https://example.com?first=a&amp;amp;second=b">example</a></p>

upekkha avatar Jan 26 '24 12:01 upekkha

A potential solution would be to adjust the MyST-Parser code to prevent the initial escaping of the URL, allowing docutils or the final HTML writer to handle the necessary escaping. This could involve simply removing or modifying the call to escapeHtml in the MyST-Parser's conversion process.

diff --git a/myst_parser/mdit_to_docutils/base.py b/myst_parser/mdit_to_docutils/base.py
index 4a02459..cee67c9 100644
--- a/myst_parser/mdit_to_docutils/base.py
+++ b/myst_parser/mdit_to_docutils/base.py
@@ -996,7 +996,7 @@ class DocutilsRenderer(RendererProtocol):
             if "classes" in conversion:
                 ref_node["classes"].extend(conversion["classes"])
 
-        ref_node["refuri"] = escapeHtml(uri)
+        ref_node["refuri"] = uri
         if implicit_text is not None:
             with self.current_node_context(ref_node, append=True):
                 self.current_node.append(nodes.Text(implicit_text))

Additional Information:

The issue seems to stem from both the MyST-Parser and docutils/html writers, where the URL is being escaped more than once. https://github.com/docutils/docutils/blob/b768e2626088711dec257b0847b563d02700a712/docutils/docutils/writers/_html_base.py

$ echo "http://a.com?foo=1&bar=2" | rst2html.py | grep foo
<p><a class="reference external" href="http://a.com?foo=1&amp;bar=2">http://a.com?foo=1&amp;bar=2</a></p>

shimizukawa avatar Feb 18 '24 08:02 shimizukawa