Next 13 - Sitemap can't fetch on Google Search Console

Open anthonyjacquelin opened this issue 2 years ago • 48 comments

Verify canary release

  • [X] I verified that the issue exists in the latest Next.js canary release

Provide environment information

Operating System:
      Platform: darwin
      Arch: x64
      Version: Darwin Kernel Version 22.1.0: Sun Oct  9 20:14:54 PDT 2022; root:xnu-8792.41.9~2/RELEASE_X86_64
    Binaries:
      Node: 18.16.0
      npm: 9.5.1
      Yarn: 1.22.19
      pnpm: 7.29.3
    Relevant packages:
      next: 13.4.6
      eslint-config-next: 13.2.4
      react: 18.2.0
      react-dom: 18.2.0
      typescript: 4.9.5

Which area(s) of Next.js are affected? (leave empty if unsure)

App directory (appDir: true)

Link to the code that reproduces this issue or a replay of the bug

https://codesandbox.com

To Reproduce

export default async function sitemap() {
  const db = await connecToDatabase();
  const usersCollection = db.collection("Users");

  // articles
  const articles = await API.getPosts(
    "",
    undefined,
    undefined,
    "published"
  )
    // Fall back to an empty list so the .map() calls below never hit undefined.
    .catch((error) => {
      console.log("error fetching content", error);
      return [] as Article[];
    });
  const articleIds = articles?.map((article: Article) => {
    return { id: article?._id, lastModified: article?.createdAt };
  });
  const posts = articleIds.map(({ id, lastModified }) => ({
    url: `${URL}/${id}`,
    lastModified: lastModified,
  }));

  // users
  const profiles = await usersCollection.find({}).toArray();
  const users = profiles
    ?.filter((profile: User) => profile?.userAddress)
    ?.map((profile: User) => {
      return {
        url: `${URL}/profile/${profile.userAddress}`,
        lastModified: new Date().toISOString(),
      };
    });

  // tags
  const tagsFromDb = articles
    ?.map((article: Article) => article?.categories)
    ?.flat();

  const uniqueTags = tagsFromDb.reduce((acc, tag) => {
    const existingTag = acc.find((item) => item.id === tag.id);

    if (!existingTag) {
      acc.push(tag);
    }

    return acc;
  }, []);

  const tags = uniqueTags
    ?.filter((tag) => tag?.id)
    ?.map((tag) => {
      return {
        url: `${URL}/tags/${tag.id}`,
        lastModified: new Date().toISOString(),
      };
    });

  const staticPages = [
    {
      url: `${URL}`,
      lastModified: new Date().toISOString(),
    },
    { url: `${URL}/about`, lastModified: new Date().toISOString() },
    { url: `${URL}/read`, lastModified: new Date().toISOString() },
  ];

  return [...posts, ...users, ...tags, ...staticPages];
}

Describe the Bug

Hello,

I'm using Next 13 with the /app directory and trying to configure the sitemap of my project on Google Search Console.

I have used the documentation as described there: Documentation

I have a sitemap.ts in the root of my /app directory, but it does not seem to be recognized by GSC. I know the sitemap is valid (URL), and I've also checked it using this tool.

Expected Behavior

I want /sitemap.xml to be recognized by Google Search Console.

Which browser are you using? (if relevant)

No response

How are you deploying your application? (if relevant)

No response

anthonyjacquelin avatar Jun 22 '23 11:06 anthonyjacquelin

Do you have a file at app/robots.ts? See here for an example.

This file lets engines and crawlers know where to find your sitemap. You can read more about it here

SuttonJack avatar Jun 23 '23 02:06 SuttonJack

Same issue. I have enabled the sitemap and added the following code to app/robots.ts, but I still cannot register the sitemap.

import type { MetadataRoute } from "next"

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: "*",
      },
    ],
    sitemap: "https://my-url.xyz/sitemap.xml",
    host: "https://my-url.xyz",
  }
}

Maybe some more time needs to pass, so I'll give it a little more time.

ryuji-orca avatar Jun 23 '23 05:06 ryuji-orca

Have you tried putting it in the /public folder instead?

JasonA-work avatar Jun 26 '23 23:06 JasonA-work

I've tried, but no luck. I don't think public has anything to do with it, because the official and leerob sites put their sitemap and robots files in app/.

https://github.com/vercel/commerce/blob/70dcfa9736bb2067713a425e17ee6e59fb3fca2b/app/sitemap.ts#L8 https://github.com/leerob/leerob.io/blob/main/app/sitemap.ts

ryuji-orca avatar Jun 28 '23 10:06 ryuji-orca

Has anyone solved this issue? I'm still stuck on it without any hint of a possible solution...

anthonyjacquelin avatar Jun 29 '23 14:06 anthonyjacquelin

Also experiencing this. A sitemap.xml file placed in public with appDir cannot be parsed by Google Search Console. Falling back to pages and following this older tutorial does work.

CJEnright avatar Jul 04 '23 03:07 CJEnright

I tried, but even though my new sitemap is valid, nothing changed...

anthonyjacquelin avatar Jul 04 '23 13:07 anthonyjacquelin

I followed the Next.js documentation for robots.ts and sitemap.xml and have the same problem.

loverphp487 avatar Jul 09 '23 03:07 loverphp487

Has anyone managed to work around this error in any way? I'm still encountering the problem on my side.

anthonyjacquelin avatar Jul 25 '23 17:07 anthonyjacquelin

I have the same problem on Next 13.4.7. The sitemap.xml path loads fine when I open it in my browser...

octane96 avatar Aug 08 '23 11:08 octane96

After saving the dynamically generated sitemap.xml from the browser and storing it in the public directory, Google Search Console was able to load it. I am not sure if this is a Next.js issue or a Google Search Console issue, but I will get by with this for now. Very disappointed...

octane96 avatar Aug 09 '23 04:08 octane96

Thanks for your feedback, so at the end of the day this is not dynamic anymore...

anthonyjacquelin avatar Aug 09 '23 04:08 anthonyjacquelin

That's right... I agree.

octane96 avatar Aug 09 '23 04:08 octane96

So maybe we could create a cron API route that writes this sitemap.xml file every day or week using fs.

anthonyjacquelin avatar Aug 09 '23 04:08 anthonyjacquelin
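
The cron idea above could be sketched roughly as follows. This is only an illustration, not something confirmed in this thread: `writeSitemapFile` is a hypothetical helper, and it assumes a self-hosted Node server with a writable disk (serverless hosts usually expose a read-only filesystem at runtime). The file is written to a temporary name and then renamed, so a crawler should never read a half-written sitemap.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: write sitemap XML into `dir` (for example the
// project's public/ folder) so it can be served as a static file.
function writeSitemapFile(dir: string, xml: string): string {
  const target = join(dir, "sitemap.xml");
  const temp = `${target}.tmp`;
  // Write to a temporary file first, then rename it over the target.
  writeFileSync(temp, xml, "utf8");
  renameSync(temp, target);
  return target;
}

// Demo: a temporary directory stands in for public/ here.
const demoDir = mkdtempSync(join(tmpdir(), "sitemap-"));
const demoXml =
  '<?xml version="1.0" encoding="UTF-8"?>\n' +
  '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>\n';
const written = writeSitemapFile(demoDir, demoXml);
console.log(written, readFileSync(written, "utf8") === demoXml);
```

A scheduled job (system cron, a CI step, or a protected API route) would call `writeSitemapFile` with the freshly generated XML at whatever interval suits the site.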

Thanks for the very good ideas! I've written a simple cron for now, so I'll get by with that for a while!

octane96 avatar Aug 09 '23 04:08 octane96

This is a bit off topic, but it seems that sitemap.ts is built statically. Is that how it is supposed to be...?

If so, it does not have to be cron.

octane96 avatar Aug 09 '23 05:08 octane96

I'm not sure that sitemap.xml has to be statically generated.

The most important thing is to have an up to date version of your sitemap if you have dynamic pages being created.

anthonyjacquelin avatar Aug 09 '23 05:08 anthonyjacquelin

I agree.

Sorry if I didn't communicate it well. Looking at the build log, the sitemap seems to be generated only once, at build time. I wish it were always generated dynamically.

octane96 avatar Aug 09 '23 06:08 octane96
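
On the build-time behaviour described above: the Next.js docs describe sitemap.ts as a special Route Handler that is cached by default, and route segment config is the usual way to opt out of that. The following is a minimal sketch under that assumption, not a confirmed fix for the Search Console error. `BASE_URL` and `buildEntries` are hypothetical, and the local `SitemapEntry` type stands in for `MetadataRoute.Sitemap` from `next` so the snippet is self-contained.

```typescript
// app/sitemap.ts (sketch)

// Local stand-in for the entries of MetadataRoute.Sitemap from "next".
type SitemapEntry = { url: string; lastModified: string };

// Hypothetical site origin; replace with the real one.
const BASE_URL = "https://example.com";

// Route segment config: ask Next.js to run this on every request
// instead of reusing the result produced at build time.
export const dynamic = "force-dynamic";

// Pure helper so the entry list can be checked without running Next.js.
function buildEntries(paths: string[], now: Date): SitemapEntry[] {
  return paths.map((path) => ({
    url: `${BASE_URL}${path}`,
    lastModified: now.toISOString(),
  }));
}

export default async function sitemap(): Promise<SitemapEntry[]> {
  // In a real app the paths would come from a database or API call.
  return buildEntries(["", "/about", "/read"], new Date());
}
```

If regenerating on every request is too expensive, exporting `revalidate` with a number of seconds instead of `dynamic` is the time-based alternative.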

Hi guys!

I don't think it is an error, neither on Google's side nor on Vercel's. Or rather, it may be a kind of error on Google's side, because I really think it should show a better message in this situation. You can read further info about this in the link below: https://support.google.com/webmasters/thread/184533703/are-you-seeing-couldn-t-fetch-reported-for-your-sitemap?hl=en&sjid=15254935347152386554-SA

I spent 30 minutes searching on the web thinking it was a problem.

lucas-soler avatar Aug 10 '23 14:08 lucas-soler

If it is not a bug and is just due to the time Google needs to process the sitemap, all our sitemaps would have been handled by Google after a while. The fact is that even after 1 month I still see "can't fetch".

So there might be a bigger problem than just a messy error message + time needed for google to handle it.

anthonyjacquelin avatar Aug 12 '23 17:08 anthonyjacquelin

I'm not using the app folder (my sitemap.xml is a simple public file) and I had the same issue.

After waiting for almost a month, I tried something else: I created sitemap2.xml and Google fetched it successfully. Both files are identical...

I think Google keeps some cache of a file and if it failed retrieving it once, it fails again and again. So probably not related to Next.js at all.

c100k avatar Aug 22 '23 14:08 c100k

It's been quite a while since I posted the first article, but it still hasn't been registered in the sitemap 😓.

The following thread reports that a sitemap got registered in Search Console after changing <xml version="1.0" encoding="UTF-8">...</xml> to <?xml version="1.0" encoding="UTF-8"?>. However, is it necessary to include the <?xml version="1.0" encoding="UTF-8"?> declaration in the sitemap.xml generated from the app directory? 🧐

Google Article

I've also noticed someone on X experiencing the same error as me. However, it seems to be happening with Remix as well, so it might be a problem on Google's side.

ryuji-orca avatar Aug 26 '23 14:08 ryuji-orca

I ended up creating a sitemap.xml file in public and pasting in the sitemap.xml that Next.js generated.

Now Google is able to find my sitemap.xml...

But I'd still like a proper solution, because I don't want to copy and paste every time my sitemap changes.

seogki avatar Aug 29 '23 01:08 seogki

Has anyone using the app directory managed to make it work?

anthonyjacquelin avatar Sep 12 '23 17:09 anthonyjacquelin

I encountered the same issue as you, in my case with "next": "13.4.19", the App Router, and their native sitemap solution (https://nextjs.org/docs/app/api-reference/file-conventions/metadata/sitemap#generate-a-sitemap).

I found this article and applied the Next.js 13.2 and lower solution proposed by the article https://claritydev.net/blog/nextjs-dynamic-sitemap-pages-app-directory#nextjs-132-and-lower

What happened? The route app/sitemap.xml/route.ts didn't work, and I suspected it might be due to caching by Google...

...so I tried app/sitemap2.xml/route.ts, and it worked (yep, same code...)

Now, sitemap2.xml is working properly in Google Search Console.

My sitemap.xml is still available with the same code, but Search Console is unable to fetch it. I removed it and added it again, and it's still not working. So, my plan is to remove it for some days or weeks and then try adding it again. At least, I'm indexing with sitemap2.xml, which is dynamic.

iperdev avatar Sep 15 '23 11:09 iperdev

I tried that

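import type { NextRequest } from "next/server";

export async function GET(req: NextRequest) {
  const sitemap: any = await getSitemap();

  // The sitemap protocol namespace uses the http:// scheme, not https://.
  const toXml = (urls: any) => `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">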
${urls
  .map((item: any) => {
    return `
<url>
    <loc>${item.url}</loc>
    <lastmod>${item.lastModified}</lastmod>
    <changefreq>${item.changeFrequency}</changefreq>
    <priority>${item.priority}</priority>
</url>
    `;
  })
  .join('')}
</urlset>`;

  return new Response(toXml(sitemap), {
    status: 200,
    headers: {
      'Cache-control': 'public, s-maxage=86400, stale-while-revalidate',
      'content-type': 'application/xml'
    }
  });
}

But Google can't find it. I also tried sitemap2.xml inside the public folder and got the same error.

felri avatar Sep 16 '23 18:09 felri
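
A hand-rolled handler like the one above will print the literal text `undefined` inside `<changefreq>` and `<priority>` whenever an entry lacks those fields, and it does not escape characters such as `&` in URLs. Whether either of these contributes to the Search Console error is not established in this thread; the sketch below just shows one way to rule them out. `Entry`, `escapeXml`, `tag`, and `toSitemapXml` are hypothetical names.

```typescript
// Hypothetical entry shape; only `url` is required.
type Entry = {
  url: string;
  lastModified?: string;
  changeFrequency?: string;
  priority?: number;
};

// Escape the five characters that are special in XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Emit a tag only when a value is present, so "undefined" never leaks out.
function tag(name: string, value: string | number | undefined): string {
  if (value === undefined) return "";
  return `    <${name}>${escapeXml(String(value))}</${name}>\n`;
}

function toSitemapXml(entries: Entry[]): string {
  const urls = entries
    .map(
      (e) =>
        "  <url>\n" +
        tag("loc", e.url) +
        tag("lastmod", e.lastModified) +
        tag("changefreq", e.changeFrequency) +
        tag("priority", e.priority) +
        "  </url>\n"
    )
    .join("");
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    urls +
    "</urlset>\n"
  );
}
```

The returned string can be passed to `new Response(...)` with the same headers as in the handler above.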

I have the same issue with the pages router. Google hasn't been able to fetch my sitemap for at least 6 months. I tried sitemap2.xml inside the public folder and that doesn't work either.

Has anyone successfully added a sitemap with the pages router?

didyk avatar Oct 06 '23 08:10 didyk

https://www.jcchouinard.com/sitemap-could-not-be-read-couldnt-fetch-in-google-search-console/#:~:text=The%20%E2%80%9CSitemap%20could%20not%20be,the%20sitemap%20is%20not%20indexed

ghost avatar Oct 30 '23 17:10 ghost

I added a trailing slash to my sitemap URL and it started to work. Both links load fine in the browser.

/sitemap.xml/

And Google managed to pick it up.

https://ruchern.xyz/sitemap.xml/

ruchernchong avatar Nov 12 '23 13:11 ruchernchong

In case it helps more people, @ruchernchong's suggestion above worked, which basically confirms that Google caches the failed sitemap.xml unless you change the path when resubmitting it. I confirmed that Cloudflare was blocking Bingbot and Googlebot on their basic plan, so I had to turn off Bot Fight Mode and resubmit the sitemap.xml with the trailing slash to get it to work. What shocks me is that this is still a bug and no team at Google has bothered to fix it yet. It has to be hurting their search coverage.

ackshaey avatar Jan 07 '24 12:01 ackshaey
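
Since bot protection turned out to be part of the problem here, a quick check of how the sitemap URL responds to a crawler-like request can help narrow things down. This is a rough sketch under assumptions: `inspectSitemapResponse` and `checkSitemap` are hypothetical helpers, and a service that verifies crawlers by IP address may treat this request differently from the real Googlebot, so a clean result does not prove Google can fetch the file. It needs a runtime with a global `fetch` (Node 18 or later).

```typescript
// User-Agent string modelled on Googlebot's; sending it does not make the
// request come from Google's IP ranges.
const CRAWLER_UA =
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";

// Pure check: list anything in the response that looks wrong for a sitemap.
function inspectSitemapResponse(
  status: number,
  contentType: string | null,
  body: string
): string[] {
  const problems: string[] = [];
  if (status !== 200) problems.push(`unexpected status ${status}`);
  if (!contentType || !/xml/i.test(contentType)) {
    problems.push(`content-type is ${contentType ?? "missing"}`);
  }
  if (!/<(urlset|sitemapindex)[\s>]/.test(body)) {
    problems.push("body has no <urlset> or <sitemapindex> root");
  }
  return problems;
}

// Fetch the URL with the crawler-like User-Agent and report any problems.
async function checkSitemap(url: string): Promise<string[]> {
  const res = await fetch(url, { headers: { "User-Agent": CRAWLER_UA } });
  return inspectSitemapResponse(
    res.status,
    res.headers.get("content-type"),
    await res.text()
  );
}
```

Calling `checkSitemap("https://example.com/sitemap.xml").then(console.log)` with the real sitemap URL prints an empty array when nothing looks off; a 403 status or an HTML challenge page would show up in the list.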