apify-docs
Readme metadata "slug" option should support multiple URLs, with automated redirects to the main URL
The documentation Markdown metadata has a slug parameter, which indicates the URL under which the page is published in the docs.
The problem is that when the slug changes, the old URL breaks unless it is added manually to the nginx configuration. That configuration is messy, with duplicate entries and chained redirects, so e.g. a link to https://docs.apify.com/scraping takes you to a totally irrelevant page, https://docs.apify.com/platform/actors/running. This is hard to manage and causes more broken links: https://charts.apify.com/dashboards/147-documentation
For context, see https://apifier.slack.com/archives/CQ96RHG2U/p1701348240817199
The solution would be to extend the slug functionality to allow multiple comma- or space-separated URLs, such as:
---
title: Getting started with Apify scrapers
menuTitle: Getting started
...
slug: /apify-scrapers/getting-started, /scraping/tutorial, /something-else
---
The docs engine would consider the first slug URL the primary one, and the other URLs would 301 redirect to it. This way, the docs writers can easily and safely ensure URLs don't break over time - they would only ever add URLs to the slug, never remove them.
If there's a conflict, e.g. multiple pages pointing to the same URL, the docs build process should throw an error.
Together with this task, we should also migrate the nginx config settings to this new feature.
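For illustration, a minimal sketch of how a build step could handle this, assuming the comma- or space-separated slug format above; the function name and page paths are hypothetical, not anything that exists in the repo today:

```ts
// Hypothetical build-time helper: split the multi-value slug, treat the first
// entry as the canonical URL, and fail the build if two pages claim the same URL.
const claimedUrls = new Map<string, string>(); // URL -> page that already owns it

function registerSlugs(page: string, slugField: string): { primary: string; redirects: string[] } {
  const urls = slugField.split(/[,\s]+/).filter(Boolean);
  const [primary, ...redirects] = urls;
  for (const url of urls) {
    const owner = claimedUrls.get(url);
    if (owner !== undefined && owner !== page) {
      throw new Error(`Slug conflict: ${url} is claimed by both ${owner} and ${page}`);
    }
    claimedUrls.set(url, page);
  }
  return { primary, redirects };
}

// The frontmatter example above would resolve to:
//   primary:   /apify-scrapers/getting-started
//   redirects: /scraping/tutorial, /something-else (each would 301 to the primary)
registerSlugs(
  'sources/apify-scrapers/getting-started.md',
  '/apify-scrapers/getting-started, /scraping/tutorial, /something-else',
);
```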
> slug: /apify-scrapers/getting-started, /scraping/tutorial, /something-else
I am not sure if this is possible; we would probably need to hack Docusaurus internals, which I really don't want to do. Also, the server-side redirects are handled outside of Docusaurus - since it only generates an SPA, there is no server involved; it's all static content deployed to GitHub Pages. On its own, Docusaurus only supports client-side redirects.
But I agree we need something like this, and I'd say a new, separate option for the redirects could work better (as we won't need to change anything in Docusaurus itself). The current nginx redirects are quite messy; we could improve them by generating them from the frontmatter of the source pages.
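For reference, a minimal sketch of what Docusaurus offers out of the box - the official @docusaurus/plugin-client-redirects plugin, which only generates client-side redirect pages rather than real 301s; the paths here are placeholders:

```ts
// docusaurus.config.ts - client-side redirects only, no server involved.
import type { Config } from '@docusaurus/types';

const config: Config = {
  title: 'Apify Docs',
  url: 'https://docs.apify.com',
  baseUrl: '/',
  plugins: [
    [
      '@docusaurus/plugin-client-redirects',
      {
        redirects: [
          // Generates an extra HTML page at /scraping/tutorial that redirects
          // in the browser to the canonical URL.
          { from: '/scraping/tutorial', to: '/apify-scrapers/getting-started' },
        ],
      },
    ],
  ],
};

export default config;
```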
Perhaps we can just have a script to parse the READMEs and generate the nginx config automatically? Completely outside of Docusaurus. It could look like this (a sketch of such a generator follows the example):
---
title: Getting started with Apify scrapers
menuTitle: Getting started
...
slug: /apify-scrapers/getting-started
redirectSlugs: /scraping/tutorial, /something-else
---
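A minimal sketch of the generator itself, assuming a redirectSlugs frontmatter field as above; the sources directory, output path, and the use of the gray-matter package for frontmatter parsing are assumptions for illustration:

```ts
// Hypothetical generate-redirects.ts - walks the docs sources, reads the
// frontmatter, and emits an nginx 301 rule for every redirect slug.
import * as fs from 'node:fs';
import * as path from 'node:path';
import matter from 'gray-matter';

function* markdownFiles(dir: string): Generator<string> {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) yield* markdownFiles(full);
    else if (/\.mdx?$/.test(entry.name)) yield full;
  }
}

const seen = new Map<string, string>(); // redirect source -> file that claimed it
const rules: string[] = [];

for (const file of markdownFiles('./sources')) {
  const { data } = matter(fs.readFileSync(file, 'utf8'));
  if (!data.slug || !data.redirectSlugs) continue;
  const redirects: string[] = String(data.redirectSlugs).split(/[,\s]+/).filter(Boolean);
  for (const from of redirects) {
    // Fail the build on conflicts, i.e. two pages claiming the same old URL.
    if (seen.has(from)) {
      throw new Error(`Conflicting redirect ${from} in ${file} and ${seen.get(from)}`);
    }
    seen.set(from, file);
    rules.push(`location = ${from} { return 301 ${data.slug}; }`);
  }
}

fs.writeFileSync('./nginx/generated-redirects.conf', rules.join('\n') + '\n');
```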
This is not about readmes; the sources can be spread across many files (inside multiple repositories). But yes, that's what I was thinking of: an additional option that would be ignored by Docusaurus. We would still probably keep some manually written rules, but most of the existing ones (and more importantly the new ones) could be inferred from the source files, which is indeed easier to work with. Nowadays we have the nginx configs inside the main docs repository, so it should all be doable.
I wanted to have a look at #754 later today, which will require a script to go through all the source files anyway, so I will think about handling this too (just the base for now - I guess we will need to manually go through all the redirects, but that can happen gradually).