
NEW: Added sitemap_suffix_included to better work with Cloudflare Pages and search engines

Open lextm opened this issue 1 year ago • 4 comments

When hosting a Sphinx project on Cloudflare Pages with default suffix .html, a very annoying fact is that the Cloudflare platform generates 301 responses to remove the suffix.

Search engines (especially Google) dislike such redirects and refuse to index the affected pages, which makes the generated sitemap less useful for SEO.

Thus, this pull request proposes a new setting, sitemap_suffix_included, to control whether .html is written to sitemap.xml. The default value is True, which keeps the current behavior. When it is set to False, the generated sitemap.xml works well with Cloudflare and SEO.
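
For illustration, here is a minimal conf.py sketch of how the proposed setting might be used. It assumes the option lands under the name given in this pull request; sitemap_suffix_included may not exist in a released version of sphinx-sitemap, and the base URL is a placeholder.

```python
# conf.py (sketch; values are placeholders)

# Enable the sphinx-sitemap extension.
extensions = ["sphinx_sitemap"]

# Base URL that sphinx-sitemap uses when building entries in sitemap.xml.
html_baseurl = "https://docs.example.com/"

# Proposed in this pull request, not necessarily available in a release:
# True (the default) keeps ".html" in sitemap URLs, False omits it.
sitemap_suffix_included = False
```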

lextm avatar Mar 26 '24 06:03 lextm

Thanks for the PR! I think this approach works. The alternative would be to add the file suffix to the URL scheme, but that would be a breaking change for anyone not using the default scheme, and I don't think that is worth a major version bump at this point. (If only I had had the foresight to pick a better default scheme from the beginning.)

I can't cut a release for a couple weeks, but will as soon as I have the time to respond to any surprise issues, should they arise (don't expect any though).

> a very annoying fact is that the Cloudflare platform generates 301 responses to remove the suffix.

Annoying indeed. I guess I'm old school, but I don't understand the disdain for the .html extension.

jdillard avatar Mar 26 '24 17:03 jdillard

@jdillard Thanks for the comments. No rush to include this, I think; my team can stick with our own fork.

Cloudflare not only dislikes the .html extension, but also removes default from the end of URLs. That might make some sense from an SEO perspective, but it just creates difficulty for Sphinx site owners.

lextm avatar Apr 11 '24 21:04 lextm

@lextm Just curious, would using the dirhtml builder work in your case? It changes the build structure to remove the need for .html and this extension already supports the dirhtml builder. If that is the case I might just need to add documentation about using that builder in this kind of scenario.
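
To make the difference concrete, here is a rough sketch of how page URLs differ between the two builders. The helper names html_uri and dirhtml_uri are hypothetical and only mirror the URL layout each builder is understood to produce; they are not part of Sphinx or sphinx-sitemap.

```python
def html_uri(docname: str) -> str:
    """URL path for a document under the default html builder."""
    return docname + ".html"


def dirhtml_uri(docname: str) -> str:
    """URL path for a document under the dirhtml builder.

    Each page is written as <docname>/index.html, so it can be
    served from a directory-style URL with no ".html" suffix.
    """
    if docname == "index":
        # The root document is served from the site root.
        return ""
    if docname.endswith("/index"):
        # "guide/index" is served from "guide/".
        return docname[: -len("index")]
    return docname + "/"


for name in ("index", "usage", "guide/index"):
    print(f"{name!r}: html={html_uri(name)!r} dirhtml={dirhtml_uri(name)!r}")
```

With this layout, the URLs listed in the sitemap contain no .html suffix for Cloudflare to strip.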

jdillard avatar Apr 15 '24 19:04 jdillard

@jdillard I will give that a try then.

lextm avatar Apr 15 '24 19:04 lextm