zulip-archive
README.md: Add step that explains how to use the sitemap.xml properly
Follow-up to commit 980675c6bb268d7cd222aa84cdc11446de3ba99a.
Is there a way to automate adding to robots.txt, so that we don't even need to document this?
Hmm, I haven't heard of such a thing existing yet.
This can be quickly tested by committing a robots.txt with the sitemap info to the zulip-archive repo root.
See e.g. https://github.com/twbs/bootstrap/blob/gh-pages/robots.txt.
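For reference, a minimal robots.txt with a Sitemap directive might look like the sketch below; the URLs are placeholders assuming the archive is published at the root of https://example.github.io:

```
User-agent: *
Allow: /

Sitemap: https://example.github.io/sitemap.xml
```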
Yes, that is possible if the zulip-archive lives right at the root of the URL, e.g. https://example.github.io.
But for GitHub Pages sites, especially those that don't use a custom domain, zulip-archive usually doesn't live at the root of the domain, e.g. https://example.github.io/zulip-archive/.
In that case, placing a robots.txt there makes it appear at https://example.github.io/zulip-archive/robots.txt, which search engine bots won't read, since crawlers only fetch robots.txt from the domain root.
The ones with a custom domain should work: that Bootstrap GH Pages site's domain is getbootstrap.com, and they have robots.txt committed at the repo root.
The case where it doesn't work is when the repo name doesn't match the `<user>.github.io` subdomain, so the site is served under a repo-name path, e.g. https://example.github.io/zulip-archive/. OK, that case should be documented. But the rest can be automated, as sketched below.
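A rough sketch of what that automation could look like; the function and parameter names here are illustrative, not part of the actual zulip-archive configuration:

```python
from pathlib import Path
from urllib.parse import urlparse


def write_robots_txt(site_url: str, out_dir: str) -> None:
    """Write a robots.txt pointing at sitemap.xml, but only when the
    archive is served from the domain root; a robots.txt under a repo
    path would never be fetched by crawlers.

    site_url and out_dir are hypothetical parameters, not real
    zulip-archive settings.
    """
    parsed = urlparse(site_url)
    if parsed.path not in ("", "/"):
        # Project pages like https://example.github.io/zulip-archive/
        # can't serve /robots.txt; fall back to documenting it.
        return
    sitemap_url = site_url.rstrip("/") + "/sitemap.xml"
    body = f"User-agent: *\nAllow: /\n\nSitemap: {sitemap_url}\n"
    Path(out_dir, "robots.txt").write_text(body)


# Writes robots.txt for a root-hosted site...
write_robots_txt("https://example.github.io", ".")
# ...and silently skips a project-pages site, leaving it to the README.
write_robots_txt("https://example.github.io/zulip-archive", ".")
```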