
Should Tyxml percent-encode URLs?

Open Julow opened this issue 4 years ago • 2 comments

Code like this:

p [ a ~a:[ a_href "#val-(^)" ] [ txt "val (<)" ] ];
p [ a ~a:[ a_href "#val-(<)" ] [ txt "val (<)" ] ];
p [ a ~a:[ a_href "#val-(>)" ] [ txt "val (<)" ] ];

Generates HTML like this:

<p><a href="#val-(^)">val (&lt;)</a></p>
<p><a href="#val-(&lt;)">val (&lt;)</a></p>
<p><a href="#val-(&gt;)">val (&lt;)</a></p>

Of course, web browsers accept this but tidy-html5 generates warnings:

Warning: <a> escaping malformed URI reference
Warning: <a> illegal characters found in URI
Warning: <a> escaping malformed URI reference
Warning: <a> illegal characters found in URI

Should Tyxml percent-encode the value of href attributes?
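
For illustration, here is a minimal sketch of what percent-encoding the fragment by hand could look like before it reaches a_href. The helpers is_unreserved and percent_encode are hypothetical and not part of Tyxml. The set of characters left untouched (only the RFC 3986 "unreserved" characters) is an assumption chosen to be conservative, not necessarily what Tyxml should do.

```ocaml
(* Hypothetical helpers, not part of Tyxml. *)

(* Assumption: only RFC 3986 "unreserved" characters are left as-is.
   Every other byte is escaped, which is stricter than strictly needed. *)
let is_unreserved c =
  match c with
  | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' | '-' | '.' | '_' | '~' -> true
  | _ -> false

(* Replace each byte outside the unreserved set with %XX (uppercase hex). *)
let percent_encode s =
  let buf = Buffer.create (String.length s * 3) in
  String.iter
    (fun c ->
      if is_unreserved c then Buffer.add_char buf c
      else Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    s;
  Buffer.contents buf

(* Print the encoded form of the three fragments from the example above. *)
let () =
  List.iter
    (fun op -> print_endline ("#" ^ percent_encode ("val-(" ^ op ^ ")")))
    [ "^"; "<"; ">" ]
```

Under these assumptions, the encoded string would be built first and then handed to a_href, for example a_href ("#" ^ percent_encode "val-(<)").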

Julow avatar Apr 01 '20 16:04 Julow

Excellent question, I'm slightly surprised it does that. Do you know what the spec says?

Drup avatar Apr 10 '20 15:04 Drup

It seems that () and ^ are not allowed in the URL (https://url.spec.whatwg.org/#absolute-url-with-fragment-string). The HTML spec also seems to link directly to the URL spec (here), but it appears to allow named character references everywhere. I couldn't find this stated explicitly, but there doesn't seem to be any exception. I found an HTML4 reference that allows them explicitly.

Julow avatar Apr 10 '20 16:04 Julow