Sitemap Generation

Open gryphonmyers opened this issue 3 years ago • 5 comments

Sitemaps are an extremely common need. Basically every application should have one for SEO purposes. Since this plugin is taking on responsibility for rendering out all the routes of an application, I feel like there should be a solution (or at least guidance) facilitating the generation of a sitemap file.

One possibility would be to simply treat a sitemap as any other page file. There may be issues around it showing up as a route alongside "real" application routes though, not sure. Sitemaps are also xml, generally placed at root as sitemap.xml, so I'm not sure if the prerendering logic could currently handle that.

Another option would be to automatically generate sitemaps for users based on their prerender routes. There would need to be some additional configurability in order to facilitate things like i18n, lastmod, etc. This also means users who didn't bother to fill in prerender hooks (perhaps they're just using straight SSR) won't get a sitemap. Otherwise though, it seems like a more user-friendly option.
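
As a rough illustration of what such a generator would have to emit, here is a minimal sketch that turns a list of URLs (with an optional lastmod) into a sitemap.xml string. The names `SitemapEntry`, `escapeXml`, and `buildSitemapXml` are hypothetical and not part of the plugin; only the XML layout follows the sitemaps.org format.

```typescript
// One sitemap entry; `lastmod` is optional (a W3C date such as "2021-04-29").
interface SitemapEntry {
  loc: string;
  lastmod?: string;
}

// Escape the five characters that must be entity-encoded in XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a complete sitemap.xml document from a list of entries.
function buildSitemapXml(entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const lastmod = entry.lastmod
      ? `<lastmod>${escapeXml(entry.lastmod)}</lastmod>`
      : "";
    return `  <url><loc>${escapeXml(entry.loc)}</loc>${lastmod}</url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}
```

The open question in this issue is where the list of entries comes from, not how to serialize it.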

Thoughts / ideas?

gryphonmyers, Apr 29 '21 07:04

Basically every application should have one for SEO purposes

How so? I thought that sitemaps are only useful to let crawlers know about unreachable/non-linked pages (or to make crawlers discover pages faster that are reachable only after going through a lot of links.)

If it requires only a minimal amount of changes to src/, I'd be fine with accommodating it. We should be careful about feature creep.

brillout, Apr 29 '21 21:04

next.js doesn't have this built-in. I'd say this is more of a guide if anything. This gets fairly complicated as you can see by https://github.com/iamvishnusankar/next-sitemap

chrisvariety, Apr 30 '21 02:04

How so? I thought that sitemaps are only useful to let crawlers know about unreachable/non-linked pages (or to make crawlers discover pages faster that are reachable only after going through a lot of links.)

It's best practice to always have one for medium to large sites, then point Google at it. Google may be able to crawl all your routes, but it's imperfect. In more complex i18n scenarios where you're performing server-side geo redirects it is 100% necessary in order to inform Google about pages it can't reach.
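
For the i18n case, a sitemap can list every language variant of a page through `xhtml:link` alternates. The following is only a sketch: `Alternate` and `buildLocalizedUrlEntries` are hypothetical names, the URLs are assumed to be XML-safe already, and the enclosing `<urlset>` is assumed to declare `xmlns:xhtml="http://www.w3.org/1999/xhtml"`.

```typescript
interface Alternate {
  lang: string; // hreflang value, e.g. "en" or "de"
  href: string; // absolute URL, assumed to be XML-safe already
}

// Emit one <url> entry per language variant; every entry lists all variants
// (including itself) as alternates.
function buildLocalizedUrlEntries(alternates: Alternate[]): string {
  const links = alternates
    .map(
      (alt) =>
        `    <xhtml:link rel="alternate" hreflang="${alt.lang}" href="${alt.href}"/>`
    )
    .join("\n");
  return alternates
    .map((alt) => `  <url>\n    <loc>${alt.href}</loc>\n${links}\n  </url>`)
    .join("\n");
}
```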

next.js doesn't have this built-in. I'd say this is more of a guide if anything. This gets fairly complicated as you can see by https://github.com/iamvishnusankar/next-sitemap

It is common to do it this way, yes. All I'm saying is since this project is an alternative to products like Next/Nuxt, we should have a solution in mind for this extremely common need. A separate package makes total sense.

Considering that I need this functionality, I'd be happy to work on a solution once we have consensus on what that solution should be.

gryphonmyers, Apr 30 '21 03:04

How about:

  1. Expose all page routes at contextProps.pageRoutes.

  2. Expose all page URLs at contextProps.pageUrls. (Dynamic routes are included when doing pre-rendering.)

  3. addPage() API.

// Npm package @vite-plugin-ssr/sitemap

import { addPage } from 'vite-plugin-ssr/api'

export { addSitemap }

function addSitemap() {
  addPage({
    '.page.js': require.resolve('./path/to/sitemap.page.js'),
    '.page.route.js': require.resolve('./path/to/sitemap.page.route.js'),
    '.page.server.js': require.resolve('./path/to/sitemap.page.server.js'),
    // Empty `sitemap.page.client.js` for zero browser-side JavaScript
    '.page.client.js': require.resolve('./path/to/sitemap.page.client.js'),
  })
}

// vite.config.js

import ssr from 'vite-plugin-ssr/plugin'
import { addSitemap } from '@vite-plugin-ssr/sitemap'

addSitemap()

export default {
  plugins: [ssr()]
}

next.js doesn't have this built-in. I'd say this is more of a guide if anything. This gets fairly complicated as you can see by https://github.com/iamvishnusankar/next-sitemap

I also care about keeping core lean & simple. The addPage() API would also enable automatic generation of a Terms of Service page, /manifest.json, etc.

brillout, Apr 30 '21 07:04

For prerendering with sitemaps, I did the following workaround, using sitemap.js:

I created a prerender.ts file:

import { prerender } from 'vite-plugin-ssr/cli';
import { SitemapStream, streamToPromise } from 'sitemap';
import { Readable } from 'stream';
import { locationOrigin } from './env';
import fs from 'fs/promises';

// An array with your links
const urlList: string[] = [];
const stream = new SitemapStream({ hostname: locationOrigin });

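// Await each step so that failures reject instead of being silently dropped.
async function main() {
    await prerender({ pageContextInit: { urlList } });
    // Resolves with a Buffer containing the sitemap XML
    const data = await streamToPromise(Readable.from(urlList).pipe(stream));
    await fs.writeFile('./dist/client/sitemap.xml', data.toString());
}

main().catch((error) => {
    console.error(error);
    process.exit(1);
});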

This passes down a variable named urlList, which pages populate with their URLs during prerendering.

Inside _default.page.server.ts I do the following:

import { locationOrigin } from '@env';

async function render(pageContext: PageContextBuiltIn & PageContext) {
    if (pageContext.urlList && pageContext.url !== '/fake-404-url') {
        pageContext.urlList.push(`${locationOrigin}${pageContext.url}`);
    }
    // ...
}

When building, I use ts-node and this self-tailored prerender file: vite build && vite build --ssr && ts-node ./prerender.ts

I'm sure this could be made more elegant, and the logic could be refactored so that both SSR and SSG use the same sitemap generation, but it does its job for the time being.
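
One way such a refactor might look is a small helper that both the SSR server and the prerender script call from render(). This is only a sketch: `collectSitemapUrl` and `FAKE_404_URL` are hypothetical names, and the de-duplication is an addition that the original workaround does not have.

```typescript
// The placeholder URL that is rendered for the 404 page during prerendering.
const FAKE_404_URL = "/fake-404-url";

// Add the absolute URL of a page to `urlList`, skipping the 404 placeholder
// and any URL that was already collected.
function collectSitemapUrl(
  urlList: string[],
  origin: string,
  pathname: string
): void {
  if (pathname === FAKE_404_URL) return;
  // Drop a trailing slash on the origin so the result has no double slash.
  const absoluteUrl = `${origin.replace(/\/$/, "")}${pathname}`;
  if (!urlList.includes(absoluteUrl)) urlList.push(absoluteUrl);
}
```

Inside render(), the call would be something like `collectSitemapUrl(pageContext.urlList, locationOrigin, pageContext.url)`.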

truumahn, Jun 30 '22 11:06

Automatic sitemap generation would be great and it's definitely on the radar to enable the ecosystem to build such extensions.

Actually, it may already be possible with https://vite-plugin-ssr.com/extends. (It will require reading private pageContext._* properties, but we can make them public/stable.) Contributions to try this out are very welcome.

Closing in the meantime as it's not a top priority for now. Also, it may be already possible (soon).

brillout, May 31 '23 20:05

@brillout Can you please elaborate on how this works using extends?

(Currently I'm generating a sitemap.xml from the information about which pages are prerendered to dist/client.)
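
A minimal sketch of that kind of approach, assuming prerendered pages end up as index.html files (for example dist/client/about/index.html) and that the relative file paths have already been listed, e.g. with a recursive directory walk. `htmlFileToUrl` is a hypothetical helper, not something the plugin provides.

```typescript
// Map a file path relative to dist/client to a sitemap URL.
// Returns null for files that should not appear in the sitemap.
function htmlFileToUrl(relativeFile: string, origin: string): string | null {
  const indexSuffix = "/index.html";
  const htmlSuffix = ".html";
  // Normalize Windows path separators.
  const file = relativeFile.split("\\").join("/");
  if (!file.endsWith(htmlSuffix)) return null; // assets, JSON, etc.
  if (file === "404.html") return null; // error page, not a real route

  let pathname: string;
  if (file === "index.html") {
    pathname = "/";
  } else if (file.endsWith(indexSuffix)) {
    pathname = "/" + file.slice(0, file.length - indexSuffix.length);
  } else {
    pathname = "/" + file.slice(0, file.length - htmlSuffix.length);
  }
  return origin.replace(/\/$/, "") + pathname;
}
```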

schaschko, Jun 18 '23 10:06

@schaschko Check the private pageContext._* properties, I believe they'll give you the information you need. Keep me updated: we can then turn the private properties public including proper documentation.

brillout, Jun 19 '23 06:06

Not sure I'm on the right path here. What I could come up with, using pageContext._allPageIds and the diff scaffolded from a react-ts app:

diff --git a/pages/about/index.page.server.tsx b/pages/about/index.page.server.tsx
new file mode 100644
index 0000000..6b12acf
--- /dev/null
+++ b/pages/about/index.page.server.tsx
@@ -0,0 +1,50 @@
+import ReactDOMServer from "react-dom/server";
+import { PageShell } from "./PageShell";
+import { escapeInject, dangerouslySkipEscape } from "vite-plugin-ssr/server";
+import logoUrl from "./logo.svg";
+import type { PageContextServer } from "./types";
+
+export { Page };
+
+function Page(pageProps) {
+  console.log(pageProps);
+  return <>
+    Sitemap constructed from pageProps (failed)
+    {/* <?xml version="1.0" encoding="UTF-8"?> */}
+    {/* <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> */}
+    {/* <url> */}
+    {/* <loc>https://www.example.com/foo.html</loc> */}
+    {/* <lastmod>2022-06-04</lastmod> */}
+    {/* </url> */}
+    {/* </urlset> */}
+  </>;
+}
+
+export { onBeforeRender };
+
+async function onBeforeRender(pageContext) {
+  return {
+    pageContext: {
+      pageProps: pageContext._allPageIds,
+    },
+  };
+}
+
+export { render };
+
+async function render(pageContext: PageContextServer) {
+  const { Page, pageProps } = pageContext;
+  // This render() hook only supports SSR, see https://vite-plugin-ssr.com/render-modes for how to modify render() to support SPA
+  if (!Page)
+    throw new Error("My render() hook expects pageContext.Page to be defined");
+  const pageHtml = ReactDOMServer.renderToString(<Page {...pageProps} />);
+
+  const documentHtml = escapeInject`${dangerouslySkipEscape(pageHtml)}`;
+
+  return {
+    documentHtml,
+    pageContext: {
+      // We can add some `pageContext` here, which is useful if we want to do page redirection https://vite-plugin-ssr.com/page-redirection
+    },
+  };
+}
diff --git a/pages/about/index.page.tsx b/pages/about/index.page.tsx
deleted file mode 100644
index 3cf7a11..0000000
--- a/pages/about/index.page.tsx
+++ /dev/null
@@ -1,14 +0,0 @@
-import './code.css'
-
-export { Page }
-
-function Page() {
-  return (
-    <>
-      <h1>About</h1>
-      <p>
-        Example of using <code>vite-plugin-ssr</code>.
-      </p>
-    </>
-  )
-}
diff --git a/vite.config.ts b/vite.config.ts
index f476887..f57b373 100644
--- a/vite.config.ts
+++ b/vite.config.ts
@@ -1,9 +1,15 @@
-import react from '@vitejs/plugin-react'
-import ssr from 'vite-plugin-ssr/plugin'
-import { UserConfig } from 'vite'
+import react from "@vitejs/plugin-react";
+import ssr from "vite-plugin-ssr/plugin";
+import { UserConfig } from "vite";

 const config: UserConfig = {
-  plugins: [react(), ssr()]
-}
+  plugins: [
+    react(),
+    ssr({
+      prerender: true,
+      includeAssetsImportedByServer: true,
+    }),
+  ],
+};

-export default config
+export default config;

I tried to somehow get a "blank" page into which I could inject the XML sitemap code. The most minimal output I got was the following, but that was not minimal enough:

<head><link rel="stylesheet" type="text/css" href="http://localhost:3000/assets/static/default.page.server.d4835ae9.css"></head>
Sitemap constructed from pageProps (failed)

I also tested truumahn's solution, which worked until I tried to await some data from a database in onBeforeRender; then it stopped working.

schaschko, Jun 19 '23 19:06

I'm not sure I understand your problem, but it seems like a userland problem.

I'm realizing that this data: https://github.com/brillout/vite-plugin-ssr/blob/70ab60b502a685e39e65417a011c134fed1b5bd5/vite-plugin-ssr/shared/route/loadPageRoutes.ts#L14-L21 isn't accessible. I can make it available over pageContext._pageRoutes if you believe you need that.

brillout, Jun 20 '23 08:06

I can make it available over pageContext._pageRoutes if you believe you need that.

Done. You can now access all the internal routing information over pageContext._pageRoutes.

If many use it then I'll make it a stable public API.

brillout, Jun 20 '23 12:06

I noticed _pageRoutes doesn't include the pages from +onBeforePrerenderStart. Is there any way to make this work?

For example, it just shows

{
  pageId: '/pages/hello',
  comesFromV1PageConfig: true,
  routeFunction: [Function: route],
  routeDefinedAt: '/pages/hello/+route.ts',
  routeType: 'FUNCTION'
}

But not the routes for the individual hello pages from the demo

briansunter, Nov 21 '23 09:11

@briansunter Since you provide the list of URLs, you already have that information: you don't need it from Vike. That said, you can alternatively use pageContext._prerenderContext.pageContexts. Keep in mind that it's internal, so make sure to pin Vike's version.

I'm curious: do other React/Vue/... frameworks provide features in that regard?

Also, sponsoring is welcome (you'll get a bump in feature request prioritization).

brillout, Nov 21 '23 10:11

Hey @brillout, that's basically what I was thinking: merging the static pages from this property with the URLs I'm providing in pre-render.

Should be able to import the route functions from the different pages to build that list as well.

I don't think React has anything like this built in. I'm comparing it more to SSGs like Eleventy that have knowledge of all static and dynamic routes.

Manually merging it will work, just wasn't sure if there was a better way, since I'm already providing those routes to vike.
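
A rough sketch of that manual merge, under stated assumptions: the `PageRouteLike` shape is only guessed from the `_pageRoutes` entry printed earlier in this thread (the `routeString` field in particular is an assumption), the property is internal and may change, and `mergeSitemapPaths` is a hypothetical helper.

```typescript
// Assumed shape, loosely based on the `_pageRoutes` entry shown above.
// Internal and unstable: verify it against the pinned Vike version.
interface PageRouteLike {
  pageId: string;
  routeType: string;
  routeString?: string; // assumption: present for non-function routes
}

// Combine static routes with the URLs already handed to pre-rendering.
function mergeSitemapPaths(
  pageRoutes: PageRouteLike[],
  prerenderPaths: string[]
): string[] {
  const staticPaths = pageRoutes
    // Function routes cannot be enumerated, so they are skipped here.
    .filter((route) => route.routeType !== "FUNCTION")
    .map((route) => route.routeString)
    // Keep only concrete paths: drop parameterized ("@id") and glob ("*") routes.
    .filter(
      (routeString): routeString is string =>
        typeof routeString === "string" && !/[@*]/.test(routeString)
    );
  // De-duplicate and sort for a stable sitemap order.
  return Array.from(new Set([...staticPaths, ...prerenderPaths])).sort();
}
```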

That internal property looks exactly like what I need. Will try that out and post a reference implementation if anyone's interested.

And I'll definitely look into getting more involved after finishing my current project.

briansunter, Nov 21 '23 22:11

@briansunter Thanks for circling back on this. Also make sure to check https://vike.dev/markdown#page-list.

brillout, Nov 21 '23 22:11