next-sitemap
Add support for generating sitemaps for multiple domains
I have several Next.js websites with different domains for different markets. Some domains have several locales handled with prefixes, some don't. As far as I can tell, there is no way to generate sitemaps for each domain with next-sitemap.
I would like a way to generate sitemaps for each domain, with alternate URLs both for other domains and for locale prefixes on the same or other domains. Something like:
Sitemap for example.com:
<url>
<loc>https://www.example.com/path/to/page</loc>
<xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/path/to/page" />
<xhtml:link rel="alternate" hreflang="de-de" href="https://www.example.de/path/to/page" />
<xhtml:link rel="alternate" hreflang="nl-be" href="https://www.example.be/path/to/page" />
<xhtml:link rel="alternate" hreflang="fr-be" href="https://www.example.be/fr/path/to/page" />
</url>
Sitemap for example.de:
<url>
<loc>https://www.example.de/path/to/page</loc>
<xhtml:link rel="alternate" hreflang="de-de" href="https://www.example.de/path/to/page" />
<xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/path/to/page" />
<xhtml:link rel="alternate" hreflang="nl-be" href="https://www.example.be/path/to/page" />
<xhtml:link rel="alternate" hreflang="fr-be" href="https://www.example.be/fr/path/to/page" />
</url>
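To make the requested output concrete, here is a minimal sketch of how such `<url>` entries could be built from a single market table. The `Market` type, the `markets` table, and the function names are hypothetical and only mirror the example domains above; this is not next-sitemap's API.

```typescript
// Hypothetical market table: one entry per hreflang value.
interface Market {
  hreflang: string;
  origin: string; // e.g. "https://www.example.be"
  pathPrefix: string; // "" or a locale prefix such as "/fr"
}

const markets: Market[] = [
  { hreflang: "en", origin: "https://www.example.com", pathPrefix: "" },
  { hreflang: "de-de", origin: "https://www.example.de", pathPrefix: "" },
  { hreflang: "nl-be", origin: "https://www.example.be", pathPrefix: "" },
  { hreflang: "fr-be", origin: "https://www.example.be", pathPrefix: "/fr" },
];

// Escape characters that are not allowed unescaped in XML text or attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Absolute URL of `path` in the given market.
function urlFor(market: Market, path: string): string {
  return `${market.origin}${market.pathPrefix}${path}`;
}

// Build one <url> entry for `current`, listing every market as an alternate.
function buildUrlEntry(current: Market, path: string, all: Market[]): string {
  const alternates = all.map(
    (m) =>
      `  <xhtml:link rel="alternate" hreflang="${escapeXml(m.hreflang)}" ` +
      `href="${escapeXml(urlFor(m, path))}" />`
  );
  return [
    "<url>",
    `  <loc>${escapeXml(urlFor(current, path))}</loc>`,
    ...alternates,
    "</url>",
  ].join("\n");
}

// Print the entry each market's sitemap would contain for one page.
for (const market of markets) {
  console.log(buildUrlEntry(market, "/path/to/page", markets));
}
```

The alternates are emitted in table order for every market, including the current one. The order of the alternate links should not matter to crawlers, so the same list can be reused in each domain's sitemap.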
It would have to be combined with middleware or rewrites outside of Next.js, I assume.
Today I generate sitemaps with my own code, using filenames with locale suffixes like sitemap.en.xml, sitemap.de-de.xml, etc. I use middleware to rewrite requests for /sitemap.xml to the correct locale.
I would love to be able to refactor out a lot of my custom code and just use a tool that works the same on all websites. If next-sitemap could easily be configured to generate sitemaps for all domains, I would love it. As it is now, I can't use it, but I would love to.
Caveat: This might be possible today, just that I don't understand the correct configuration for it.
Today I generate sitemaps with my own code, using filenames with locale suffixes like sitemap.en.xml, sitemap.de-de.xml, etc. I use middleware to rewrite requests for /sitemap.xml to the correct locale.
Can you share your solutions about that?
I've omitted the site-specific code. This file is located in the root of the site:
import path from "path";
import fs from "fs";

// getDomain and generateSiteMap are site-specific helpers (omitted here).

export async function createSitemap(locale: string) {
  // Get the domain name for the locale (e.g. 'en' => 'example.com')
  const domain = getDomain(locale);

  // Generate the sitemap XML for the locale
  const sitemap = await generateSiteMap(locale);

  // Create sitemap file
  const sitemapFilename = `sitemap-${domain}.xml`;
  const sitemapFile = path.resolve(`./public/${sitemapFilename}`);
  await fs.promises.writeFile(sitemapFile, sitemap);

  // Create robots file
  const robots = [
    "User-agent: *",
    "Allow: /",
    "",
    `sitemap: https://${domain}/sitemap.xml`,
  ].join("\n");
  const robotsFilename = `robots-${domain}.txt`;
  const robotsFile = path.resolve(`./public/${robotsFilename}`);
  await fs.promises.writeFile(robotsFile, robots);
}

export async function createAllSitemaps(locales: Array<string>) {
  try {
    // Await all locales so that a failed write is actually caught below
    await Promise.all(locales.map((locale) => createSitemap(locale)));
  } catch (error) {
    console.error(error);
  }
}
The above code is triggered by an API route that is pinged by a cron job.
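For readers wondering what such a trigger might look like, here is a rough sketch, not the code used on these sites. It assumes the cron job sends a POST request with a bearer token; the `CronRequest` and `CronResponse` interfaces, `makeCronHandler`, and the secret are all hypothetical. In a Next.js API route the request and response objects would be `NextApiRequest` and `NextApiResponse`, and the job would be a function like `createAllSitemaps`.

```typescript
// Minimal structural types so the sketch does not depend on Next.js itself.
interface CronRequest {
  method?: string;
  headers: Record<string, string | string[] | undefined>;
}

interface CronResponse {
  status(code: number): CronResponse;
  json(body: unknown): void;
}

// Any async function that regenerates the sitemaps for the given locales.
type SitemapJob = (locales: string[]) => Promise<void>;

// Returns a handler that runs `job` only for authorised POST requests.
function makeCronHandler(job: SitemapJob, locales: string[], secret: string) {
  return async function handler(
    req: CronRequest,
    res: CronResponse
  ): Promise<void> {
    if (req.method !== "POST") {
      res.status(405).json({ error: "Method not allowed" });
      return;
    }
    // Assumption: the cron service can send an Authorization header.
    if (req.headers["authorization"] !== `Bearer ${secret}`) {
      res.status(401).json({ error: "Unauthorized" });
      return;
    }
    try {
      // Wait for the job so that failures are reported to the caller.
      await job(locales);
      res.status(200).json({ regenerated: locales });
    } catch (error) {
      console.error(error);
      res.status(500).json({ error: "Sitemap generation failed" });
    }
  };
}
```

Passing the job in as a parameter keeps the handler easy to test without touching the file system. Note that the 500 branch is only reached if the job itself rejects, so a job that swallows its own errors will always report success.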
Middleware that handles requests and rewrites to the correct robots and sitemap files:
import { NextRequest, NextResponse } from "next/server";

// getDomain is a site-specific helper (omitted here).

export async function middleware(request: NextRequest) {
  // Get necessary info from request
  const pathname = request.nextUrl.pathname;
  const domain = getDomain(request.nextUrl.locale);

  // Handle robots
  // Regex to match /robots.txt and /robots-[domain].txt
  const robotsRegex = /^\/robots(-(\w([\w-]*\w)?\.)+[a-z]{2,})?\.txt$/;
  const isRobots = robotsRegex.test(pathname);

  // If trying to access a domain-specific or non-prod robots file directly, redirect
  if (
    (isRobots && pathname !== '/robots.txt') ||
    pathname === '/robots.non-prod.txt'
  ) {
    return NextResponse.redirect(new URL('/robots.txt', request.nextUrl));
  }

  // Rewrite to non-prod robots.txt if in development or staging
  if (
    isRobots &&
    (!process.env['SITE_ENVIRONMENT'] ||
      process.env['SITE_ENVIRONMENT'] !== 'production')
  ) {
    return NextResponse.rewrite(
      new URL(`/robots.non-prod.txt`, request.nextUrl)
    );
  }

  // Rewrite /robots.txt to /robots-[domain].txt
  if (isRobots) {
    return NextResponse.rewrite(
      new URL(`/robots-${domain}.txt`, request.nextUrl)
    );
  }

  // Handle sitemaps
  // Regex to match /sitemap.xml and /sitemap-[domain].xml
  const sitemapRegex = /^\/sitemap(-(\w([\w-]*\w)?\.)+[a-z]{2,})?\.xml$/;

  // If trying to access a domain-specific sitemap directly, redirect
  if (sitemapRegex.test(pathname) && pathname !== '/sitemap.xml') {
    return NextResponse.redirect(new URL('/sitemap.xml', request.nextUrl));
  }

  // Rewrite /sitemap.xml to /sitemap-[domain].xml
  if (sitemapRegex.test(pathname)) {
    return NextResponse.rewrite(
      new URL(`/sitemap-${domain}.xml`, request.nextUrl)
    );
  }

  // Continue if nothing matches
  return NextResponse.next();
}

export const config = {
  matcher: [
    "/sitemap.xml",
    "/robots.txt",
    "/robots.non-prod.txt",
    "/sitemap-:domain.xml",
    "/robots-:domain.txt",
  ],
};
Again, this code is a bit simplified and anonymized to not share client details.
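To illustrate which paths the sitemap pattern in the middleware accepts, here is a small standalone sketch. The regex is copied from the middleware; `classifySitemapPath` and its return values are hypothetical names used only for this illustration.

```typescript
// Same pattern as in the middleware: an optional "-<domain>" part
// between "/sitemap" and ".xml".
const sitemapRegex = /^\/sitemap(-(\w([\w-]*\w)?\.)+[a-z]{2,})?\.xml$/;

type SitemapAction = "rewrite" | "redirect" | "ignore";

// Mirror the middleware's decision for sitemap paths:
// - "/sitemap.xml" is rewritten to the domain-specific file,
// - "/sitemap-<domain>.xml" is redirected back to "/sitemap.xml",
// - anything else is left alone.
function classifySitemapPath(pathname: string): SitemapAction {
  if (!sitemapRegex.test(pathname)) return "ignore";
  return pathname === "/sitemap.xml" ? "rewrite" : "redirect";
}

const samples = [
  "/sitemap.xml", // rewrite
  "/sitemap-example.com.xml", // redirect
  "/sitemap-www.example.de.xml", // redirect
  "/sitemap.en.xml", // ignore: locale suffix with a dot, no "-<domain>"
  "/sitemap-localhost.xml", // ignore: the pattern requires a dotted domain
];

for (const pathname of samples) {
  console.log(pathname, "->", classifySitemapPath(pathname));
}
```

One consequence worth noting is that a single-label host such as localhost does not match the pattern, so a file like sitemap-localhost.xml would not be redirected by this rule.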
On some less complex sites we just use a sitemap.xml.ts file in pages and generate the domain-specific sitemap on the fly.
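As a rough sketch of the on-the-fly variant, assuming the domain is picked from the request's Host header: the `originsByHost` table and `buildSitemapForHost` are hypothetical, and a real site would load its paths from a CMS or the file system.

```typescript
// Hypothetical lookup from request host to the canonical origin of that market.
const originsByHost = new Map<string, string>([
  ["www.example.com", "https://www.example.com"],
  ["www.example.de", "https://www.example.de"],
]);

// Escape characters that are not allowed unescaped in XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the sitemap for the domain named in the Host header,
// or return null when the host is not a known market.
function buildSitemapForHost(
  host: string | undefined,
  paths: string[]
): string | null {
  // Normalise case and drop a port such as ":3000" if present.
  const hostname = (host ?? "").toLowerCase().replace(/:\d+$/, "");
  const origin = originsByHost.get(hostname);
  if (origin === undefined) return null;

  const urls = paths.map(
    (path) => `  <url><loc>${escapeXml(origin + path)}</loc></url>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}

console.log(buildSitemapForHost("www.example.de", ["/", "/path/to/page"]));
```

In a pages/sitemap.xml.ts file, getServerSideProps could call a function like this with req.headers.host, write the result to the response with an XML content type, and answer with a 404 when it returns null.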
Closing this issue due to inactivity.