
README.md: Add step that explains how to use the sitemap.xml properly

Open refeed opened this issue 3 years ago • 6 comments

Follow-up to commit 980675c6bb268d7cd222aa84cdc11446de3ba99a

refeed avatar Sep 10 '22 03:09 refeed

Is there a way to automate adding to robots.txt, so that we don't even need to document this?

rht avatar Sep 10 '22 04:09 rht

Umm, I haven't heard of such a thing existing yet.

refeed avatar Sep 10 '22 05:09 refeed

This can be quickly tested by committing robots.txt with the sitemap info to a zulip-archive repo root.
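
For a quick test, the file contents could be generated with something like the sketch below. It is only an illustration: `build_robots_txt` is a hypothetical helper (not part of zulip-archive), and it assumes sitemap.xml is served directly under the site URL. The `Sitemap:` line must be an absolute URL.

```python
def build_robots_txt(site_url: str) -> str:
    """Return robots.txt text that allows all crawlers and points at sitemap.xml.

    Assumption: sitemap.xml is published directly under site_url.
    """
    base = site_url.rstrip("/")  # avoid a double slash before sitemap.xml
    return (
        "User-agent: *\n"
        "Allow: /\n"
        "\n"
        f"Sitemap: {base}/sitemap.xml\n"
    )


if __name__ == "__main__":
    # Write the file to the current directory (e.g. the repo root) for the test.
    with open("robots.txt", "w", encoding="utf-8") as f:
        f.write(build_robots_txt("https://example.github.io/"))
```

The output is a plain three-directive file, so it can just as well be written by hand for the test.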

rht avatar Sep 10 '22 05:09 rht

See e.g. https://github.com/twbs/bootstrap/blob/gh-pages/robots.txt.

rht avatar Sep 10 '22 06:09 rht

Yes, that is possible if the zulip-archive lives right at the root of the URL, e.g. https://example.github.io.

But for GitHub Pages sites, especially those that don't use a custom domain, in most cases the zulip-archive doesn't live at the root of the domain, e.g. https://example.github.io/zulip-archive/. In this case, a robots.txt placed there will appear at https://example.github.io/zulip-archive/robots.txt, which search engine bots won't access, because they only look for robots.txt at the root of the host.
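
The distinction can be sketched as a small check. Both function names below are hypothetical, made up for illustration; the only thing assumed about crawlers is that they request `/robots.txt` at the root of the host.

```python
from urllib.parse import urlsplit


def crawler_robots_url(site_url: str) -> str:
    """URL where a crawler looks for robots.txt: always the root of the host."""
    parts = urlsplit(site_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def robots_txt_is_discoverable(site_url: str) -> bool:
    """True if a robots.txt committed at the archive root is the one crawlers fetch.

    That is only the case when the archive is served from the domain root.
    """
    return urlsplit(site_url).path in ("", "/")


if __name__ == "__main__":
    for url in ("https://example.github.io", "https://example.github.io/zulip-archive/"):
        print(url, robots_txt_is_discoverable(url), crawler_robots_url(url))
```

When the check returns False, the sitemap has to be made known to search engines some other way, which is what the README step would need to explain.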

refeed avatar Sep 10 '22 15:09 refeed

The ones with a custom domain should work: Bootstrap's GH Pages domain is getbootstrap.com, and they have robots.txt committed in the repo root.

The case where it doesn't work is when the site is served from a subpath of the GH Pages domain named after the repo, e.g. https://example.github.io/zulip-archive/. In that case, OK, this should be documented. But the rest can be automated.

rht avatar Sep 10 '22 20:09 rht