tyxml
Should tyxml percent-encode URLs?
Code like this:
p [ a ~a:[ a_href "#val-(^)" ] [ txt "val (<)" ] ];
p [ a ~a:[ a_href "#val-(<)" ] [ txt "val (<)" ] ];
p [ a ~a:[ a_href "#val-(>)" ] [ txt "val (<)" ] ];
Generates HTML like this:
<p><a href="#val-(^)">val (<)</a></p>
<p><a href="#val-(<)">val (<)</a></p>
<p><a href="#val-(>)">val (<)</a></p>
Of course, web browsers accept this but tidy-html5 generates warnings:
Warning: <a> escaping malformed URI reference
Warning: <a> illegal characters found in URI
Warning: <a> escaping malformed URI reference
Warning: <a> illegal characters found in URI
Should Tyxml percent-encode the value of href attributes?
Excellent question. I'm slightly surprised it does that. Do you know what the spec says?
It seems that () and ^ are wrong in the URL: https://url.spec.whatwg.org/#absolute-url-with-fragment-string
It also seems that the HTML spec links directly to the URL spec (here) but allows named character references everywhere (I couldn't find this stated explicitly, but there doesn't seem to be any exception).
I found an HTML4 reference that allows them explicitly.
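For reference, here is a minimal sketch of what percent-encoding the fragment could look like on the caller's side. It assumes the RFC 3986 fragment grammar (not the WHATWG URL spec linked above, which defines its own set of URL code points); under that grammar ( and ) are sub-delims and may appear literally, while ^, < and > have to be percent-encoded. The names is_fragment_char and encode_fragment are hypothetical helpers, not part of Tyxml.

```ocaml
(* Characters that RFC 3986 allows unescaped in a fragment:
   unreserved / sub-delims / ":" / "@" / "/" / "?". *)
let is_fragment_char = function
  | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' -> true
  | '-' | '.' | '_' | '~' -> true
  | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '=' -> true
  | ':' | '@' | '/' | '?' -> true
  | _ -> false

(* Hypothetical helper, not a Tyxml function: percent-encode every byte
   that is not allowed literally in a fragment. A '%' already present in
   the input is encoded too, so pass raw (not yet encoded) identifiers. *)
let encode_fragment (s : string) : string =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      if is_fragment_char c then Buffer.add_char buf c
      else Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    s;
  Buffer.contents buf

(* Print the encoded form of the three fragments from the report. *)
let () =
  List.iter
    (fun id -> print_endline ("#" ^ encode_fragment id))
    [ "val-(^)"; "val-(<)"; "val-(>)" ]
```

With something like this, the call site would become a_href ("#" ^ encode_fragment "val-(<)"). Note that the id attribute on the link target keeps the raw, unencoded name; only the reference to it is encoded.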