PageQuerySet.in_site and Sitemap do not return pages in other languages

Open bmihelac opened this issue 1 year ago • 2 comments

Issue Summary

When Wagtail is configured for multi-language content, Page.objects.in_site(site) returns only pages in the language of Site.root_page.

Similarly, the Sitemap, which does not use the in_site queryset method, also does not include translated pages.

I had thought that perhaps the language root page should be a descendant of the root page, but it seems this is not the case, as per the docs:

Wagtail stores content in a separate page tree for each locale. For example, if you have two sites in two locales, then you will see four homepages at the top level of the page hierarchy in the explorer.

Steps to Reproduce

  1. In the shell of a multi-language site, run:
    from wagtail.models import Site, Page
    site = Site.objects.select_related("root_page").get(is_default_site=True)
    qs = Page.objects.in_site(site)
  2. qs contains only pages in a single language.

Technical details

  • Python version: Run python --version.

Python 3.12.6

  • Django version: Look in your requirements.txt, or run pip show django | grep Version.

Django==4.2.15

  • Wagtail version: Look at the bottom of the Settings menu in the Wagtail admin, or run pip show wagtail | grep Version:.

wagtail==6.2

  • Browser version: You can use https://www.whatsmybrowser.org/ to find this out.

Not browser related.

Working on this

Updating the in_site method as shown below would make it handle multilingual pages as well. The Sitemap could also be updated to use in_site.

I can create a PR.

    def in_site(self, site):
        """
        This filters the QuerySet to only contain pages within the specified site.
        """
        from functools import reduce
        from operator import or_

        # The site's root page plus its counterpart in every other locale.
        root_page_and_translations = site.root_page.get_translations(inclusive=True)
        # OR together one "descendant of" Q object per locale root.
        all_descendants_q = reduce(
            or_,
            [self.descendant_of_q(p, inclusive=True) for p in root_page_and_translations],
        )
        return self.filter(all_descendants_q)
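
To illustrate the idea outside of Wagtail, here is a minimal, self-contained sketch. It assumes a simplified model in which each page has a fixed-width materialized path, a locale, and a translation key; the `Page` dataclass and the helpers `descendants`, `translations`, and `in_site` are hypothetical stand-ins, not Wagtail's API. It only shows why taking the union of the descendants of every translated root picks up all locale trees.

```python
from dataclasses import dataclass
from functools import reduce
from operator import or_


@dataclass(frozen=True)
class Page:
    path: str             # fixed-width materialized path, 4 characters per level
    locale: str
    translation_key: str  # shared by all translations of the same page


def descendants(pages, root, inclusive=True):
    """Pages whose path starts with the root's path (a prefix match)."""
    return {
        p for p in pages
        if p.path.startswith(root.path) and (inclusive or p.path != root.path)
    }


def translations(pages, page):
    """The page itself plus its counterparts in other locales."""
    return [p for p in pages if p.translation_key == page.translation_key]


def in_site(pages, site_root):
    """Union of the descendants of every translation of the site root."""
    roots = translations(pages, site_root)
    return reduce(or_, (descendants(pages, r) for r in roots))


en_home = Page("00010001", "en", "home")
pages = [
    Page("0001", "en", "tree-root"),
    en_home,
    Page("000100010001", "en", "about"),
    Page("00010002", "de", "home"),
    Page("000100020001", "de", "about"),
    Page("00010003", "en", "other-site-home"),
]

single_locale = descendants(pages, en_home)  # only the English tree
all_locales = in_site(pages, en_home)        # English and German trees
```

In this toy data, `single_locale` holds only the English homepage and its child, while `all_locales` also includes the German homepage and its child and still excludes the unrelated homepage.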

bmihelac avatar Oct 16 '24 16:10 bmihelac

https://www.mashandgravy.co.uk/blog/google-friendly-sitemaps-multilingual-wagtail-sites/ may be of use here.

Btw, this is not a bug: translations are created in their corresponding locale trees, so your Site query only picks up the default site pages (i.e. your source language).

zerolab avatar Oct 17 '24 08:10 zerolab

@zerolab Thanks for the quick response and the useful link. The Girls Not Brides website mentioned in the article is a bit specific because it has a separate website for each language.

After reading both the article and the Google documentation, I am confident that the sitemap should include all pages that are part of the same site.

For example, let's say there is only one page in two languages on a website:

  • https://www.example.com/en/
  • https://www.example.com/de/

The sitemap should contain both pages (currently it does not):

<url><loc>https://www.example.com/en/</loc></url>
<url><loc>https://www.example.com/de/</loc></url>

If the sitemap includes information about localized pages, it should still have a <url> and <loc> tag for each page on the site, whether the page is a translation or not:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://www.example.com/en/</loc>
    <xhtml:link 
               rel="alternate"
               hreflang="de"
               href="http://www.example.com/de/"/>
    <xhtml:link 
               rel="alternate"
               hreflang="en"
               href="http://www.example.com/en/"/>
  </url>
  <url>
    <loc>http://www.example.com/de/</loc>
    <xhtml:link 
               rel="alternate"
               hreflang="de"
               href="http://www.example.com/de/"/>
    <xhtml:link 
               rel="alternate"
               hreflang="en"
               href="http://www.example.com/en/"/>
  </url>
</urlset>

This aligns with the Google documentation:

https://developers.google.com/search/docs/specialty/international/localized-versions?hl=en&visit_id=637509921562028966-2232430485&rd=2#sitemap
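
As a rough sketch of how such a sitemap could be generated, the snippet below builds the XML with Python's standard `xml.etree.ElementTree`. The function `build_sitemap` and its input shape (a list of dicts mapping language code to URL, one dict per group of translated pages) are assumptions made for illustration; this is not how Wagtail's Sitemap class works.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(translation_groups):
    """Return sitemap XML with one <url> per page and hreflang alternates.

    translation_groups: list of dicts mapping language code -> URL,
    where each dict describes one page and all of its translations.
    """
    # Serialize the sitemap namespace as the default and xhtml with a prefix.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for group in translation_groups:
        # Every language version gets its own <url> entry ...
        for url in group.values():
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
            # ... and each entry lists all versions, including itself.
            for lang, alternate in sorted(group.items()):
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": lang, "href": alternate},
                )
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap(
    [{"en": "https://www.example.com/en/", "de": "https://www.example.com/de/"}]
)
```

Each language version appears as its own entry and also lists every alternate, including itself, which matches the structure of the example above.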

bmihelac avatar Oct 17 '24 09:10 bmihelac

Flagging as a documentation improvement for now.

lb- avatar Feb 23 '25 22:02 lb-