
Google SEO

Open mmann1123 opened this issue 1 year ago • 8 comments

We should strongly consider setting up Google Analytics and a sitemap for SEO. These should make the site much easier to find through search engines. Plus, you can obsessively check how many people are looking at the site at any given moment.

I am not sure how much of this has been done, but I will walk you through it just in case:

  1. Register the site with Google Search Console: https://search.google.com/search-console/about
  • This requires you to generate a sitemap.xml and host it alongside the website in the root directory.
    • It can be generated with https://pypi.org/project/sphinx-sitemap/, although since you are publishing through CI (I think), I am not entirely sure how CI would push the sitemap.xml to your hosting service.
    • Search Console will confirm things once it finds sitemap.xml; just make sure it lists all your pages.
    • Whenever you add a new page to your docs, you need to regenerate sitemap.xml to get the page crawled.
  • You might also register with Bing (https://www.bing.com/webmasters/help/add-and-verify-site-12184f8b). It should be easy enough, and with Bing Chat these days it might be worth it.
  2. If you want more detailed analytics, you might also look into adding a Google Analytics ID.
  • Once you are registered with Search Console:
    • create a new Google Analytics 4 property (instructions here)
    • add a web data stream
    • retrieve your analytics code, something like G-MQPNBR9XWX
    • add the Google Analytics script with your tag to the head of every page using https://pypi.org/project/sphinxcontrib-googleanalytics/
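
As a rough sketch of what the two extensions above would need in conf.py (assuming the module names `sphinx_sitemap` and `sphinxcontrib.googleanalytics` and the `googleanalytics_id` option as documented on their PyPI pages; the ID below is a placeholder and the existing `extensions` list in the project would be extended rather than replaced):

```python
# conf.py fragment (a sketch, not the project's actual configuration)

# Assumed extension module names for the two PyPI packages mentioned above.
extensions = [
    "sphinx_sitemap",                 # sphinx-sitemap
    "sphinxcontrib.googleanalytics",  # sphinxcontrib-googleanalytics
]

# sphinx-sitemap builds page URLs from this base URL; keep the trailing slash.
html_baseurl = "https://geowombat.readthedocs.io/"

# Placeholder measurement ID; replace it with the one from your GA4 web data stream.
googleanalytics_id = "G-XXXXXXXXXX"
```

After a rebuild, sitemap.xml should appear in the HTML output directory next to index.html.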

I am happy to help just let me know.

mmann1123 avatar Mar 30 '23 13:03 mmann1123

@mmann1123 do you want to get this in now, or do you want to review #252?

jgrss avatar Mar 31 '23 20:03 jgrss

@jgrss Happy to review 252 without this. Although it looks like CI tests are failing.

mmann1123 avatar Apr 01 '23 02:04 mmann1123

I am going to assume that you want ownership of this unless I am told otherwise, since it's associated with a particular Google account.

In particular if you can get the site registered:

  1. Register site with google search console https://search.google.com/search-console/about

I can help with the Google Analytics side of things.

mmann1123 avatar Apr 14 '23 01:04 mmann1123

Where is the Google HTML verification file supposed to live?

jgrss avatar Apr 14 '23 03:04 jgrss

  1. Go to Google Search Console: https://search.google.com/search-console/

  2. Add a new "property" with "URL prefix" as suggested.

  3. Download the googlea<id>.html file and place it in docs/source (e.g. module/docs/source). Use html_extra_path inside of conf.py as follows (i.e. add the following line somewhere in conf.py):

html_extra_path = ["googlea<id>.html"]

  4. For Sphinx, rebuild the docs:
make clean
make html

and check to make sure docs/build/html/googlea<id>.html exists.

  5. Commit, push, and wait for your website to update (this could take a few minutes).
  6. If you still have the web page from step 1 open, click VERIFY; otherwise navigate back to Google Search Console and re-enter your URL (e.g. https://geowombat.readthedocs.io/en/latest). If it worked, it should tell you so, and you can then "Go to property".
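
The "check that the file exists" step can also be scripted, which may be handy in CI. This is only a sketch: the helper name `find_verification_files` is made up for illustration, and it assumes the build output lives in docs/build/html as above.

```python
from pathlib import Path


def find_verification_files(build_dir):
    """Return the names of google*.html files at the top level of build_dir."""
    return sorted(p.name for p in Path(build_dir).glob("google*.html"))


if __name__ == "__main__":
    # Assumed output directory; adjust if your Makefile builds elsewhere.
    found = find_verification_files("docs/build/html")
    if found:
        print("verification file(s) found:", found)
    else:
        print("no google*.html file found; check html_extra_path in conf.py")
```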

Then you need to sort out the sitemap.xml.

A. To do this properly, you need to generate a sitemap.xml and host it alongside the website in the root directory.

  • This can be generated with https://pypi.org/project/sphinx-sitemap/, although since you are publishing through CI (I think), I am not entirely sure how CI would push the sitemap.xml to your hosting service. But it likely does it automatically.

B. Once the sitemap is published to the web, open https://geowombat.readthedocs.io/sitemap.xml in your browser. It should list ALL the pages making up your website. CHECK that they are valid URLs, i.e. https://geowombat.readthedocs.io/en/latest/tutorial.html and not https://geowombat.readthedocs.ioen/latest/tutorial.html.

  • If your URLs are missing the / after https://geowombat.readthedocs.io, check that html_baseurl in your Sphinx conf.py ends with a /, for instance https://geowombat.readthedocs.io/ and not https://geowombat.readthedocs.io

C. Go to Google Search Console again. Open the Sitemaps tab and paste the full URL to your sitemap, i.e. https://geowombat.readthedocs.io/sitemap.xml
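
If eyeballing the sitemap in step B gets tedious, the same check can be sketched in a few lines of Python. The function name `malformed_urls` is hypothetical, and the example XML is hand-written to mirror the good and bad URLs mentioned above rather than taken from the real sitemap.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def malformed_urls(sitemap_xml, base_url):
    """Return <loc> entries that do not start with base_url plus a trailing slash."""
    if not base_url.endswith("/"):
        base_url += "/"
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return [url for url in locs if not url.startswith(base_url)]


# Hand-written example: one valid entry, one missing the slash after the host.
example = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://geowombat.readthedocs.io/en/latest/tutorial.html</loc></url>
  <url><loc>https://geowombat.readthedocs.ioen/latest/tutorial.html</loc></url>
</urlset>"""

print(malformed_urls(example, "https://geowombat.readthedocs.io/"))
```

An empty list means every entry starts with the expected base URL; anything printed is worth fixing via html_baseurl before submitting the sitemap.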

mmann1123 avatar Apr 14 '23 04:04 mmann1123

Steps 1-3 are done in #262.

jgrss avatar Apr 14 '23 23:04 jgrss

@jgrss can we close this?

mmann1123 avatar Sep 28 '23 17:09 mmann1123

oops accidentally closed

mmann1123 avatar Sep 28 '23 17:09 mmann1123