SEO issue: do not use useLocation() to compute canonical urls

Open slorber opened this issue 2 years ago • 9 comments

Have you read the Contributing Guidelines on issues?

Prerequisites

  • [X] I'm using the latest version of Docusaurus.
  • [X] I have tried the npm run clear or yarn clear command.
  • [X] I have tried rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
  • [X] I have tried creating a repro with https://new.docusaurus.io.
  • [X] I have read the console error message carefully (if applicable).

Description

The way we compute the canonical url today:

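import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
import useBaseUrl from '@docusaurus/useBaseUrl';
import {useLocation} from '@docusaurus/router';
import {applyTrailingSlash} from '@docusaurus/utils-common';

// Builds the canonical URL from the router's current pathname.
// The pathname is dynamic: after hydration it reflects the browser URL.
function useDefaultCanonicalUrl() {
  const {
    siteConfig: {url: siteUrl, baseUrl, trailingSlash},
  } = useDocusaurusContext();
  const {pathname} = useLocation();
  const canonicalPathname = applyTrailingSlash(useBaseUrl(pathname), {
    trailingSlash,
    baseUrl,
  });
  return siteUrl + canonicalPathname;
}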

Using useLocation().pathname works in most cases, but it is a bad idea because it is a dynamic value that depends on the current browser URL. The canonical URL in the static HTML files might be correct, but once React hydrates, it is recomputed from the browser URL and can change to something else.

Notably, suppose you use your CDN or reverse proxy to configure aliases: a doc exists at /doc1 and you also make it available at /doc1alias. If you visit /doc1alias, then after React hydrates the canonical URL will be /doc1alias instead of /doc1 (i.e. two canonical URLs for the same doc).
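
To make the alias case concrete, here is a minimal sketch that models the hook above as a pure function of the pathname. The names applyTrailingSlashSketch and canonicalFromPathname, and the site https://example.com, are hypothetical. The trailing-slash helper is a simplified stand-in rather than the real applyTrailingSlash, and baseUrl is assumed to be "/".

```typescript
// Simplified stand-in for applyTrailingSlash (assumption: the real helper
// covers more cases, e.g. query strings and hashes).
function applyTrailingSlashSketch(
  path: string,
  trailingSlash: boolean | undefined,
): string {
  if (trailingSlash === undefined || path === '/') {
    return path;
  }
  const stripped = path.replace(/\/+$/, '');
  return trailingSlash ? `${stripped}/` : stripped;
}

// Pure model of the hook: the canonical URL is a function of whatever
// pathname the router reports (baseUrl assumed to be "/").
function canonicalFromPathname(
  siteUrl: string,
  pathname: string,
  trailingSlash?: boolean,
): string {
  return siteUrl + applyTrailingSlashSketch(pathname, trailingSlash);
}

const siteUrl = 'https://example.com'; // hypothetical site

// Build time: the page is rendered at its real route.
const staticCanonical = canonicalFromPathname(siteUrl, '/doc1');
// After hydration on the alias: the router reports the browser path.
const hydratedCanonical = canonicalFromPathname(siteUrl, '/doc1alias');

console.log(staticCanonical); // https://example.com/doc1
console.log(hydratedCanonical); // https://example.com/doc1alias
console.log(staticCanonical === hydratedCanonical); // false
```

The same doc ends up with two different canonical URLs only because the input pathname differs between the build and the browser.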

I'm not sure it's a big deal for SEO, since crawlers probably extract the static canonical URL from the page, which is correct before React hydration, but we should still try to find a solution.

Note that setting up such reverse proxy aliases might be common. We also discuss it in the following issue as a good solution if you want docs version aliases: see https://github.com/facebook/docusaurus/issues/9049

Similarly, hreflang values depend on useLocation and can be wrong on aliased documents.

Related to https://github.com/facebook/docusaurus/issues/9128

Reproducible demo

No response

Steps to reproduce

We don't have any doc alias in our prod website, but the 404 case is a great example.

Take a look at https://docusaurus.io/not/found/path

  • before hydration, canonical URL is https://docusaurus.io/404.html
  • after hydration, canonical URL is https://docusaurus.io/not/found/path
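
A minimal sketch of how the two values could be compared, assuming you have captured the static HTML (for example from the server response) and the hydrated HTML (for example a serialized DOM) as strings. The helpers extractCanonical and canonicalIsStable are hypothetical, and the regex is deliberately naive: it assumes double-quoted attributes with rel before href.

```typescript
// Naive extraction of <link rel="canonical" href="..."> from an HTML string.
// Assumption: rel appears before href and both attributes are double-quoted.
function extractCanonical(html: string): string | null {
  const match = html.match(/<link[^>]*rel="canonical"[^>]*href="([^"]*)"/i);
  return match?.[1] ?? null;
}

// True only if a canonical URL exists and is identical in both snapshots.
function canonicalIsStable(staticHtml: string, hydratedHtml: string): boolean {
  const before = extractCanonical(staticHtml);
  const after = extractCanonical(hydratedHtml);
  return before !== null && before === after;
}

// Hypothetical snapshots mirroring the 404 example above.
const staticHtml =
  '<head><link data-rh="true" rel="canonical" href="https://docusaurus.io/404.html"></head>';
const hydratedHtml =
  '<head><link data-rh="true" rel="canonical" href="https://docusaurus.io/not/found/path"></head>';

console.log(extractCanonical(staticHtml)); // https://docusaurus.io/404.html
console.log(extractCanonical(hydratedHtml)); // https://docusaurus.io/not/found/path
console.log(canonicalIsStable(staticHtml, hydratedHtml)); // false
```

In a browser, the hydrated value can also be read directly with document.querySelector('link[rel="canonical"]')?.getAttribute('href') once the page has finished loading.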

Expected behavior

The canonical URL, hreflang and other metadata using pathname should always be the same before and after React hydration.

Actual behavior

The values are different before and after hydration.

Your environment

No response

Self-service

  • [X] I'd be willing to fix this bug myself.

slorber avatar Jul 21 '23 14:07 slorber

@slorber I want to work on the task but does front matter plugin support aliases?

prathamVaidya avatar Dec 04 '23 07:12 prathamVaidya

@slorber I want to work on the task but does front matter plugin support aliases?

I have no idea what you mean or why you ask this question, sorry.

slorber avatar Dec 15 '23 11:12 slorber

@slorber As I understand it, the issue asks to update the logic that sets the canonical URL with useLocation, because the value changes with the page URL after hydration. If we are using canonical URLs, then I expect there can be multiple aliases for a URL. For example, for docs, there could be a doc at 'doc1' that also has an alias named 'doc1alias'.

  1. My first question: I looked through the documentation, but I can't find a feature that lets me add a slug or URL alias to a document.
  2. If there is no way to add an alias yet, then how can there be multiple URLs for a page (not including the /404 page)?

I hope I didn't confuse you this time 😅

prathamVaidya avatar Dec 15 '23 13:12 prathamVaidya

Sorry, my confusion came from your mention of a "front matter plugin", which doesn't exist.

It's explained in the issue:

Notably, suppose you use your CDN or reverse proxy to configure aliases: a doc exists at /doc1 and you also make it available at /doc1alias. If you visit /doc1alias, then after React hydrates the canonical URL will be /doc1alias instead of /doc1 (i.e. two canonical URLs for the same doc).

slorber avatar Dec 15 '23 15:12 slorber