Special characters in URLs are not XML-converted when generating the sitemaps

Open nanawel opened this issue 2 years ago • 3 comments

If you have a post with a URL like, let's say, something-&-some-other-thing, the generated XML for the sitemaps is invalid because the & is not converted to a safe equivalent such as the percent-encoded %26 (or the XML entity &amp;amp;, which should work here too).

This is expected, since the URL is injected directly into the XML without any conversion. See for example: https://github.com/danpros/htmly/blob/53db3bdb0dcb675870ad98f8ab9d4bb4e1628a92/system/includes/functions.php#L2514

I suppose a quick fix might be to use htmlentities($p->url, ENT_XML1, 'UTF-8') instead.
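
As a rough illustration of that quick fix, here is a minimal sketch of escaping a URL before it is placed inside a `<loc>` element. The helper name `sitemap_loc()` and the example URL are hypothetical and not part of HTMLy; the sketch uses `htmlspecialchars()` rather than `htmlentities()`, which should be equivalent for this purpose when `ENT_XML1` is set.

```php
<?php
// Hypothetical helper, not part of HTMLy: wraps a URL in a sitemap <url> entry.
function sitemap_loc(string $url): string
{
    // ENT_XML1 restricts the output to XML's predefined entities;
    // ENT_QUOTES additionally escapes single quotes.
    $escaped = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');

    return '<url><loc>' . $escaped . '</loc></url>';
}

// Example URL is made up for illustration.
echo sitemap_loc('https://example.com/something-&-some-other-thing'), "\n";
// prints: <url><loc>https://example.com/something-&amp;-some-other-thing</loc></url>
```

The real code in functions.php builds its strings differently, so this only shows where the escaping call would sit.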

A more robust way would be to rewrite the sitemap generation and use an XML library instead of building it from text blocks, but that's a whole different scope.
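
To give an idea of what the library-based approach could look like, here is a minimal sketch using PHP's bundled `XMLWriter` extension. The function name `build_sitemap()` and the plain array of URLs are assumptions for illustration only; HTMLy collects its posts differently, and a real sitemap would likely also carry elements such as `<lastmod>`.

```php
<?php
// Hypothetical function, not part of HTMLy: builds a sitemap from a list of URLs.
function build_sitemap(array $urls): string
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->startDocument('1.0', 'UTF-8');

    $xml->startElement('urlset');
    $xml->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');

    foreach ($urls as $url) {
        $xml->startElement('url');
        // writeElement() escapes special characters such as & in the text content.
        $xml->writeElement('loc', $url);
        $xml->endElement(); // closes <url>
    }

    $xml->endElement(); // closes <urlset>
    $xml->endDocument();

    return $xml->outputMemory();
}

// Example URL is made up for illustration.
echo build_sitemap(['https://example.com/something-&-some-other-thing']);
```

Because the writer handles escaping itself, no call site can forget it, which is the main advantage over concatenating text blocks.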

I don't have the time to propose a fully-tested PR for now, but I'll take a look if I can.

nanawel avatar Sep 25 '21 14:09 nanawel

how did you generate the sitemap?

a11y-bit avatar Jan 24 '23 08:01 a11y-bit

> how did you generate the sitemap?

Well, I never did. It's just the default generation provided by HTMLy when opening <base_url>/sitemap.xml.

nanawel avatar Jan 24 '23 08:01 nanawel

thanks

a11y-bit avatar Jan 24 '23 11:01 a11y-bit