
Multilanguage and sitemaps not properly supported

Open · kennylajara opened this issue 3 years ago · 1 comment

Google and other search engines only look for the robots.txt file in the root directory.

http://example.com/folder/robots.txt is not a valid robots.txt location; crawlers don't check for robots.txt files in subdirectories (see Google's documentation).

That means the robots.txt for any non-default language is being ignored, e.g. http://example.com/fr/robots.txt.

So the default behavior should be for robots.txt to ignore any language configuration, so that a single file is served at the site root, and all the sitemaps should be included in that root robots.txt.
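
A minimal urls.py sketch of that idea, assuming i18n_patterns is what adds the language prefixes and django-robots is wired up via include('robots.urls') as in its docs (adjust to your project; 'myproject.urls' is a placeholder):

# urls.py -- serve robots.txt at the site root, outside the language prefixes
from django.conf.urls.i18n import i18n_patterns
from django.urls import include, path

urlpatterns = [
    # Not wrapped in i18n_patterns, so it is only ever served at /robots.txt
    path('robots.txt', include('robots.urls')),
]

urlpatterns += i18n_patterns(
    # Language-prefixed URLs, e.g. /fr/...
    path('', include('myproject.urls')),  # placeholder for your site URLs
)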


Note: I found that you allow developers to change the settings, so I can add the sitemaps myself, and having robots.txt in the language folders won't hurt my SEO; it is just unnecessary. I am just letting you know because many developers are probably using the app without figuring this out. By the way, thank you for the nice tool.
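
For anyone else who hits this, the manual workaround I mean is roughly the following in settings.py; I am going from memory of the django-robots settings, so treat the setting name as an assumption and check the package docs for your installed version:

# settings.py -- list the sitemap URL(s) for django-robots to emit explicitly
# (ROBOTS_SITEMAP_URLS is the setting name as I recall it from the docs;
#  verify it against the django-robots version you have installed)
ROBOTS_SITEMAP_URLS = [
    'https://example.com/sitemap.xml',
]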

kennylajara avatar Feb 25 '22 06:02 kennylajara

Here's some example code to support multilanguage robots.txt and sitemaps with django-robots:

  1. To add support for multilanguage robots.txt files, you can create a custom view that dynamically generates the robots.txt file based on the user's browser language preference. This view can then be included in your URLconf, so that it responds to requests for /robots.txt.
from django.views.generic import View
from django.http import HttpResponse

class RobotsTxtView(View):
    content_type = 'text/plain'

    def get(self, request, *args, **kwargs):
        lang = request.LANGUAGE_CODE
        if lang == 'en':
            content = self.get_english_robots_txt()
        elif lang == 'fr':
            content = self.get_french_robots_txt()
        else:
            content = self.get_default_robots_txt()
        return HttpResponse(content, content_type=self.content_type)

    def get_english_robots_txt(self):
        # Modify this to generate your English robots.txt file
        return 'User-agent: *\nDisallow: /'

    def get_french_robots_txt(self):
        # Modify this to generate your French robots.txt file
        return 'User-agent: *\nDisallow: /fr/'

    def get_default_robots_txt(self):
        # Modify this to generate your default robots.txt file
        return 'User-agent: *\nDisallow: /'

# in your urls.py
from django.urls import path
from .views import RobotsTxtView

urlpatterns = [
    # ... your other url patterns
    path('robots.txt', RobotsTxtView.as_view(), name='robots_txt'),
]
  2. To include sitemap URL(s) in your robots.txt file, you can generate the robots.txt content dynamically in the view and append the sitemap URLs:
from django.views.generic import View
from django.http import HttpResponse
from django.urls import reverse

class RobotsTxtView(View):
    content_type = 'text/plain'

    def get(self, request, *args, **kwargs):
        # Assumes your sitemap URL pattern is named 'sitemap' (reversing by
        # dotted view path no longer works in modern Django)
        sitemap_url = request.build_absolute_uri(reverse('sitemap'))
        content = f'User-agent: *\nDisallow:\nSitemap: {sitemap_url}\n'
        return HttpResponse(content, content_type=self.content_type)

# in your urls.py
from django.urls import path
from .views import RobotsTxtView

urlpatterns = [
    # ... your other url patterns
    path('robots.txt', RobotsTxtView.as_view(), name='robots_txt'),
]

This view generates a robots.txt file that disallows nothing (i.e., allows all robots) and includes the sitemap URL(s) using the Sitemap directive. The build_absolute_uri() method is used to generate a full URL for the sitemap view.

You can modify the content variable to suit your needs; for example, to include multiple sitemaps, add multiple Sitemap directives separated by newlines.
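
A sketch of that multiple-sitemap variant, reusing the imports from the view above; 'sitemap-pages' and 'sitemap-blog' are hypothetical URL pattern names standing in for whatever your URLconf actually defines:

class RobotsTxtView(View):
    content_type = 'text/plain'

    def get(self, request, *args, **kwargs):
        # One Sitemap: line per named sitemap URL pattern ('sitemap-pages'
        # and 'sitemap-blog' are placeholders for your own URL names)
        sitemap_urls = [
            request.build_absolute_uri(reverse(name))
            for name in ('sitemap-pages', 'sitemap-blog')
        ]
        sitemap_lines = '\n'.join(f'Sitemap: {url}' for url in sitemap_urls)
        content = f'User-agent: *\nDisallow:\n{sitemap_lines}\n'
        return HttpResponse(content, content_type=self.content_type)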

some1ataplace avatar Mar 27 '23 21:03 some1ataplace