hackage-server

Sitemap logic needs to conform to specs.

Open chrissound opened this issue 8 years ago • 7 comments

A quick Google search for "haskell persistent insert" seems to return the docs for v0.3.1.3, even though the latest version is v2.7.

I think if we set a higher priority for the latest version in sitemap.xml, Google will pick this up and rank the latest version higher in the search results.

chrissound avatar May 30 '17 21:05 chrissound

Here's what the sitemap says currently:

  <url>
    <loc>https://hackage.haskell.org/package/persistent/docs</loc>
    <lastmod>2017-04-17</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://hackage.haskell.org/package/persistent-0.3.1.3/docs</loc>
    <lastmod>2017-04-17</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.25</priority>
  </url>

So it already uses a higher priority for the unversioned URL, which always points to the latest version...

hvr avatar May 30 '17 22:05 hvr

Hmm. The v2.7 entry is at a priority of 0.25 though.

(Please excuse the strange formatting).

    <priority>0.25</priority>
  </url>
  <url>
    <loc>https://hackage.haskell.org/package/persistent-2.7.0/docs</loc>

chrissound avatar May 31 '17 07:05 chrissound

True, but if Google already doesn't appear to prefer persistent/docs over persistent-0.3.1.3/docs, why would it make a difference if the priority for persistent-2.7.0/docs were somewhere in between? (I'm not saying we shouldn't try it, I'm just trying to understand why you think it would make a difference.)

hvr avatar May 31 '17 08:05 hvr
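For reference, the proposal under discussion can be sketched as follows. This is only an illustration in Python, not hackage-server's actual code: the function names, the `latest_priority=0.5` default, and keeping `monthly` as the change frequency for the latest versioned URL are all assumptions made for the example. Whether a search engine would act on the changed priority is exactly the open question in this thread.

```python
from xml.sax.saxutils import escape

BASE = "https://hackage.haskell.org/package"


def url_entry(loc, lastmod, changefreq, priority):
    """Render one <url> element in the same layout as the sitemap excerpt above."""
    return (
        "  <url>\n"
        f"    <loc>{escape(loc)}</loc>\n"
        f"    <lastmod>{lastmod}</lastmod>\n"
        f"    <changefreq>{changefreq}</changefreq>\n"
        f"    <priority>{priority}</priority>\n"
        "  </url>"
    )


def package_entries(name, versions, lastmod, latest_priority=0.5):
    """Build entries for one package.

    `versions` is ordered oldest first, so the last element is the latest.
    The unversioned URL keeps priority 1.0 and old versions keep 0.25, as in
    the current sitemap; `latest_priority` (hypothetical value) sits between.
    """
    entries = [url_entry(f"{BASE}/{name}/docs", lastmod, "daily", 1.0)]
    latest = versions[-1]
    for version in versions:
        priority = latest_priority if version == latest else 0.25
        entries.append(
            url_entry(f"{BASE}/{name}-{version}/docs", lastmod, "monthly", priority)
        )
    return entries


if __name__ == "__main__":
    print("\n".join(package_entries("persistent", ["0.3.1.3", "2.7.0"], "2017-04-17")))
```

With the two versions mentioned here, this yields three entries: the unversioned URL at 1.0, persistent-0.3.1.3 at 0.25, and persistent-2.7.0 at the in-between value.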

I'm honestly not too sure; I'm just taking a guess here.

chrissound avatar May 31 '17 16:05 chrissound

I finally got us set up on the search management console and discovered that our sitemap is rejected by Google :-/

"Your Sitemap contains too many URLs. Please create multiple Sitemaps with up to 50000 URLs each and submit all Sitemaps."

So we'll need to rethink some logic here.

gbaz avatar Mar 19 '18 17:03 gbaz

Actually, it's not rejected. It just accepts only the first chunk of URLs, which is the majority...

gbaz avatar Mar 20 '18 08:03 gbaz

We can do multiple sitemaps, as per: https://support.google.com/webmasters/answer/75712

Also, sitemaps should have a link for every URL, not just subdirectories.

gbaz avatar Mar 25 '18 07:03 gbaz
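The multiple-sitemaps approach mentioned above could look roughly like this. It is a minimal Python sketch, not hackage-server's implementation: the `sitemap-N.xml` file names, the function names, and the `base` parameter are hypothetical, and only the 50000-URL limit comes from the error message quoted earlier. The idea is to split the full URL list into chunks, write one sitemap per chunk, and publish a sitemap index that points at each of them.

```python
from xml.sax.saxutils import escape

MAX_URLS = 50000  # per-sitemap limit quoted in the Google error message
NAMESPACE = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS):
    """Split the URL list into consecutive slices of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def render_sitemap(urls):
    """Render one sitemap file containing only <loc> for each URL."""
    body = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{NAMESPACE}">\n{body}\n</urlset>\n'
    )


def render_index(sitemap_urls):
    """Render the sitemap index that lists every per-chunk sitemap."""
    body = "\n".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{NAMESPACE}">\n{body}\n</sitemapindex>\n'
    )


def build_sitemaps(urls, base="https://hackage.haskell.org"):
    """Return (index_xml, {file_name: sitemap_xml}) for the given URLs."""
    files = {}
    for n, part in enumerate(chunk(urls), start=1):
        files[f"sitemap-{n}.xml"] = render_sitemap(part)  # hypothetical naming
    index = render_index([f"{base}/{name}" for name in files])
    return index, files
```

For example, a list of 120001 URLs would be split into three files (50000, 50000, and 20001 entries), and the index would contain three `<sitemap>` elements. Only the index then needs to be submitted to the search console.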