Locale handling complete?
Hi,
great plugin, has worked perfectly for me so far!
One question about multi-locale handling: I noticed that the generated sitemap for a multi-locale site refers to the other (secondary?) locales via rel=alternate. But according to this article from the Google Search Console help, each locale version of a specific page must additionally have its own <loc>-tag (check out the example there).
So unless I am missing something, the sitemap generated by the plugin contains all pages from one locale and their references to the other locales, but not the other locales themselves.
Can someone confirm this?
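One way to check this against a generated sitemap is sketched below. It is only a rough check, not part of the plugin: the function name missing_alternate_locs and the SAMPLE string are made up for illustration, and the sample mirrors a single-locale sitemap whose page lists a German and an English alternate.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample: one <url> entry that references two locales.
SAMPLE = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://localhost/de/b/foobar-german</loc>
    <xhtml:link rel="alternate" hreflang="de" href="http://localhost/de/b/foobar-german"/>
    <xhtml:link rel="alternate" hreflang="en" href="http://localhost/en/b/foobar-english"/>
  </url>
</urlset>
"""


def missing_alternate_locs(sitemap_xml: str) -> set[str]:
    """Return alternate hrefs that never appear as a <loc> of their own."""
    root = ET.fromstring(sitemap_xml)
    # Every URL that has its own <url><loc> entry.
    locs = {
        loc.text.strip()
        for loc in root.iter(f"{{{SITEMAP_NS}}}loc")
        if loc.text
    }
    # Every URL that is only referenced as a rel=alternate link.
    alternates = {
        link.get("href")
        for link in root.iter(f"{{{XHTML_NS}}}link")
        if link.get("rel") == "alternate" and link.get("href")
    }
    return alternates - locs


print(missing_alternate_locs(SAMPLE))
# → {'http://localhost/en/b/foobar-english'}
```

An empty set would mean every referenced locale version also has its own entry.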
Looking through the models, Sitemap_AlternateUrlModel::getDomElement looks like it should produce the expected output (i.e. separate nodes within the same <url>).
Are you seeing something different? Could you share the output you’re getting?
On 18 Dec 2015, at 14:41, Benjamin Grössing [email protected] wrote:
Hi,
great plugin, has worked perfectly for me so far!
One question about multi-locale handling: I noticed that the generated sitemap for a multi-locale site refers to the other (secondary?) locales via rel=alternate. But according to this article from the Google Search Console help https://support.google.com/webmasters/answer/2620865?hl=en, each locale version of a specific page must additionally have its own <loc>-tag (check out the example there).
So unless I am missing something, the sitemap generated by the plugin contains all pages from one locale and their references to the other locales, but not the other locales themselves.
Can someone confirm this?
— Reply to this email directly or view it on GitHub https://github.com/joshuabaker/craft-sitemap/issues/17.
Yes, it outputs multiple xhtml:link-tags within the same url-tag. For instance, this is my output for 1 content page in 2 languages:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://localhost/de/b/foobar-german</loc>
    <lastmod>2015-12-22T17:08:50+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
    <xhtml:link rel="alternate" hreflang="de" href="http://localhost/de/b/foobar-german"/>
    <xhtml:link rel="alternate" hreflang="en" href="http://localhost/en/b/foobar-english"/>
  </url>
</urlset>
But according to this Google Search Console Help Article, this is not enough.
In addition to the xhtml:link-tags every localized version of a page must have its own url/loc-tag as well.
In the example above, the correct output should therefore be:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://localhost/de/b/foobar-german</loc>
    <lastmod>2015-12-22T17:08:50+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
    <xhtml:link rel="alternate" hreflang="de" href="http://localhost/de/b/foobar-german"/>
    <xhtml:link rel="alternate" hreflang="en" href="http://localhost/en/b/foobar-english"/>
  </url>
  <!-- the same block again, but now with the English URL in the <loc>-tag -->
  <url>
    <loc>http://localhost/en/b/foobar-english</loc>
    <lastmod>2015-12-22T17:08:50+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
    <xhtml:link rel="alternate" hreflang="de" href="http://localhost/de/b/foobar-german"/>
    <xhtml:link rel="alternate" hreflang="en" href="http://localhost/en/b/foobar-english"/>
  </url>
</urlset>
Otherwise the other locale versions of these pages might not get indexed by search engines.
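To make the intended structure concrete, here is a minimal Python sketch of that rule: one url entry per localized URL, each repeating the full set of alternates. It is not the plugin's code (the plugin is PHP); build_sitemap and the shape of the pages argument are assumptions made for illustration, and lastmod, changefreq and priority are left out for brevity.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize with the same prefixes as the examples above.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_sitemap(pages):
    """Build a sitemap string.

    pages is a list of dicts, one per content page, mapping a locale
    code to that page's URL in the locale (hypothetical input shape).
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for localized_urls in pages:
        # Emit one <url> entry for every locale version of the page ...
        for url in localized_urls.values():
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
            # ... and list all locale versions, including itself, in each one.
            for locale, href in localized_urls.items():
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": locale, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")


pages = [
    {
        "de": "http://localhost/de/b/foobar-german",
        "en": "http://localhost/en/b/foobar-english",
    }
]
print(build_sitemap(pages))
```

For the single page above this yields two url entries, one with the German URL in the loc-tag and one with the English URL, each carrying both xhtml:link-tags.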
You're right. That output is incorrect. I'll need to review when I get some time.
Thanks for reporting and following up.
On 22 Dec 2015, at 5:17 pm, Benjamin Grössing [email protected] wrote:
Yes, it outputs multiple xhtml:link-tags within the same url-tag. For instance, this is my output for 1 content page in 2 languages:
http://localhost/de/b/foobar-german 2015-12-22T17:08:50+00:00 weekly 0.5
But according to this Google Search Console Help Article, this is not enough. In addition to the xhtml:link-tags every localized version of a page must have its own url/loc-tag as well.
In the example above, the correct output should therefore be:
http://localhost/de/b/foobar-german 2015-12-22T17:08:50+00:00 weekly 0.5
http://localhost/en/b/foobar-english 2015-12-22T17:08:50+00:00 weekly 0.5
Otherwise these sites might not get indexed by search engines.
Thanks a lot, @joshuabaker, that would be awesome!
Let me know if I can help.
@joshuabaker This fixes it (also shouldn't break any sites that use multiple domains): #18.
Would love to see that merged in :)
@groe I see that you have an updated fork. However, I don't understand the pull request from @dommmel.
If CRAFT_LOCALE is set, it still has the old behaviour, which leads to an invalid sitemap? Shouldn't we get rid of that check and include them regardless?
@sjelfull Why would it lead to an invalid sitemap?
@groe As you mentioned at the start:
So unless I am missing something, the sitemap generated by the plugin does contain all pages from one locale, and their references to other locales but not the other locales themselves.
This happens if CRAFT_LOCALE is set, and you always(?) set CRAFT_LOCALE on a multi-locale site.
That is why the conditional that checks for CRAFT_LOCALE doesn't make sense to me.
@sjelfull Depending on how your site is structured the /sitemap.xml endpoint can be called without CRAFT_LOCALE being set. For instance, if you have a single domain for all languages you could have a URL structure like:
- example.org/en/... – CRAFT_LOCALE = en
- example.org/de/... – CRAFT_LOCALE = de
- example.org/sitemap.xml – CRAFT_LOCALE is not set
Whereas if you have multiple domains you could set it up like:
- example.us/... – CRAFT_LOCALE = en
- example.us/sitemap.xml – CRAFT_LOCALE = en
- example.de/... – CRAFT_LOCALE = de
- example.de/sitemap.xml – CRAFT_LOCALE = de
... which would have each domain's sitemap only include the links for the specific locale.
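The behaviour described above can be sketched roughly as follows. This is a Python illustration of the logic as I understand it from this thread, not the plugin's actual PHP code; locales_for_sitemap and its parameters are hypothetical names, with current_locale standing in for CRAFT_LOCALE (None when it is not set).

```python
from __future__ import annotations


def locales_for_sitemap(current_locale: str | None, site_locales: list[str]) -> list[str]:
    """Pick the locales whose URLs get their own <url> entries.

    current_locale stands in for CRAFT_LOCALE (assumption): None means
    the sitemap endpoint was requested without a locale being set.
    """
    if current_locale is None:
        # Single domain, e.g. example.org/sitemap.xml: list every locale.
        return list(site_locales)
    # One domain per locale, e.g. example.de/sitemap.xml: only that locale.
    return [current_locale]


print(locales_for_sitemap(None, ["en", "de"]))  # → ['en', 'de']
print(locales_for_sitemap("de", ["en", "de"]))  # → ['de']
```

In the multi-domain case the other locales would still be referenced through the xhtml:link-tags, and each of them gets its own entries in the sitemap of its own domain.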
Formatting is incorrect when there is more than one locale. Any update on that?
@nlussier-globalia What exactly do you mean?
The browser is unable to interpret it, so all you see is the plain output (you don't see the tree).
The view source seems correct though (identical to another site that has only one language).