zulip-archive
README.md: Add step that explains how to use the sitemap.xml properly
Follow-up to commit 980675c6bb268d7cd222aa84cdc11446de3ba99a.
Is there a way to automate adding to robots.txt, so that we don't even need to document this?
Hmm, I haven't heard of such a thing existing yet.
This can be quickly tested by committing a robots.txt with the sitemap info to the zulip-archive repo root.
See e.g. https://github.com/twbs/bootstrap/blob/gh-pages/robots.txt.
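For reference, a minimal robots.txt with a Sitemap directive might look like the sketch below; the URLs are placeholders assuming the archive is published at the root of https://example.github.io:

```
User-agent: *
Allow: /

Sitemap: https://example.github.io/sitemap.xml
```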
Yes, that is possible if the zulip-archive lives right at the root of the URL, e.g. https://example.github.io.
But for GitHub Pages sites, especially those that don't use a custom domain, zulip-archive usually doesn't live at the root of the domain, e.g. https://example.github.io/zulip-archive/.
In that case, placing a robots.txt there makes it appear at https://example.github.io/zulip-archive/robots.txt, which search engine bots won't read, since crawlers only fetch robots.txt from the domain root.
The ones with a custom domain should work: that Bootstrap GH Pages site's domain is getbootstrap.com, and they have robots.txt committed at the repo root.
The case where it doesn't work is when the repo name doesn't match the `<user>.github.io` subdomain, so the site is served under a repo-name path, e.g. https://example.github.io/zulip-archive/. OK, that case should be documented. But the rest can be automated, as sketched below.
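A rough sketch of what that automation could look like; the function and parameter names here are illustrative, not part of the actual zulip-archive configuration:

```python
from pathlib import Path
from urllib.parse import urlparse


def write_robots_txt(site_url: str, out_dir: str) -> None:
    """Write a robots.txt pointing at sitemap.xml, but only when the
    archive is served from the domain root; a robots.txt under a repo
    path would never be fetched by crawlers.

    site_url and out_dir are hypothetical parameters, not real
    zulip-archive settings.
    """
    parsed = urlparse(site_url)
    if parsed.path not in ("", "/"):
        # Project pages like https://example.github.io/zulip-archive/
        # can't serve /robots.txt; fall back to documenting it.
        return
    sitemap_url = site_url.rstrip("/") + "/sitemap.xml"
    body = f"User-agent: *\nAllow: /\n\nSitemap: {sitemap_url}\n"
    Path(out_dir, "robots.txt").write_text(body)


# Writes robots.txt for a root-hosted site...
write_robots_txt("https://example.github.io", ".")
# ...and silently skips a project-pages site, leaving it to the README.
write_robots_txt("https://example.github.io/zulip-archive", ".")
```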