readthedocs.org
Redirects: simplify redirect logic / stop using `resolve_path`
Currently, we try to match all redirects on every request, but this is not always necessary, since some redirects are only valid in certain contexts.
We also use the resolver to build the final redirect URL, which is not always necessary either: in most cases we can just add or replace components of the original URL. Calling the resolver makes several queries to the DB and requires each component of the URL to be parsed.
Prefix redirects
Used when migrating from another site to RTD: we redirect all the URLs under a given prefix to the default version of the project.
This redirect is valid only when we fail to find a version, since if the user is already in `/en/latest/` it doesn't make sense to redirect.
To generate the final URL, we need to use the resolver, using the default version/language.
We currently use this redirect even if we find a version. For example, a `/foo/` prefix redirect will redirect:
- /foo/index.html -> /en/latest/index.html
- /en/latest/foo/index.html -> /en/latest/index.html (this one is wrong!)
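
The proposed behavior could be sketched roughly as follows. This is only an illustration, not the Read the Docs implementation: `prefix_redirect`, its parameters, and the hard-coded `default_root` are hypothetical, and the sketch assumes the path it receives has already had any language/version prefix stripped (which is how the wrong match above can happen).

```python
from typing import Optional


def prefix_redirect(
    filename: str,
    prefix: str,
    version_found: bool,
    default_root: str = "/en/latest/",  # assumption: would come from the resolver
) -> Optional[str]:
    """Return the target of a prefix redirect, or None if it does not apply."""
    # Proposed rule: a prefix redirect only makes sense when no version was found.
    if version_found:
        return None
    if not filename.startswith(prefix):
        return None
    # Keep whatever follows the prefix and hang it off the default version root.
    return default_root + filename[len(prefix):]
```

With this rule, `/foo/index.html` still goes to `/en/latest/index.html`, while a request that already resolved to a version is left alone.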
Page redirects
Used when a page is moved or deleted.
This redirect is valid only when we are able to find a version, since if we don't find a version we will redirect to another 404.
To generate the final URL, we don't need to use the resolver; we can just replace the path of the original URL.
We currently use this redirect even if we fail to find a version. For example, a `/foo.html` -> `/bar.html` page redirect will redirect:
- /en/latest/foo.html -> /en/latest/bar.html
- /en/not-found/foo.html -> /en/not-found/bar.html (this one will 404!)
That second case may look like expected behavior, but it isn't (or at least isn't useful), since the final URL will 404. If a user deleted a whole version, they should use an exact redirect instead, for example:
- `/foo.html` -> `/bar.html` (page redirect)
- `/en/not-found/$rest` -> `/en/latest/` (exact redirect)
This way, `/en/not-found/foo.html` will redirect to `/en/latest/foo.html`, and that will redirect to `/en/latest/bar.html`.
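
A minimal sketch of a page redirect that skips the resolver and only applies when a version was found. The function name and its parameters are hypothetical, chosen for illustration; the real code would take these values from the parsed request.

```python
from typing import Optional


def page_redirect(
    language: str,
    version: Optional[str],  # None means no version was found for the request
    filename: str,
    from_url: str,
    to_url: str,
) -> Optional[str]:
    """Return the target of a page redirect, or None if it does not apply."""
    # Without a version the target would just be another 404, so skip it.
    if version is None:
        return None
    if filename != from_url:
        return None
    # No resolver needed: keep the language/version prefix, swap the page path.
    return f"/{language}/{version}{to_url}"
```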
Exact redirects
Used to redirect a whole path to another path. This redirect is valid in all cases.
To generate the final URL, we don't need to use the resolver; we just use the `to` URL and replace the `$rest` part if it exists.
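
As a rough sketch, an exact redirect can be computed with plain string operations. `exact_redirect` is a hypothetical helper, and it assumes `$rest` only appears at the end of the `from` URL, with the matched remainder appended to the `to` URL, as in the `/en/not-found/$rest` example above.

```python
from typing import Optional

REST = "$rest"


def exact_redirect(path: str, from_url: str, to_url: str) -> Optional[str]:
    """Return the target of an exact redirect, or None if it does not apply."""
    if from_url.endswith(REST):
        prefix = from_url[: -len(REST)]
        if not path.startswith(prefix):
            return None
        # Carry over everything that matched $rest.
        return to_url + path[len(prefix):]
    # Without $rest the whole path must match exactly.
    return to_url if path == from_url else None
```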
HTML and HTML Dir redirects
Used when a project has changed from using `.html` URLs to dir (`/`) URLs or vice versa.
We could restrict this redirect to only apply when we are able to find a version, since if we don't find a version we will redirect to another 404, but it shouldn't be a problem to apply it in all cases.
To generate the final URL, we don't need to use the resolver; we just replace the extension of the original URL, that is, `foo.html` to `foo/`, `foo/` to `foo.html`, and `foo/index.html` to `foo.html`.
Note: these redirects are named "Sphinx redirects", but they apply to all tools, not just Sphinx.
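
The three rewrites listed above can be sketched with string operations alone. Both helper names are hypothetical, and leaving `index.html` and the root path untouched is an assumption of this sketch rather than something stated in the issue.

```python
from typing import Optional

HTML = ".html"
INDEX = "index.html"


def html_to_dir(path: str) -> Optional[str]:
    """foo.html -> foo/ (assumption: index.html pages are left alone)."""
    if path.endswith(HTML) and not path.endswith("/" + INDEX):
        return path[: -len(HTML)] + "/"
    return None


def dir_to_html(path: str) -> Optional[str]:
    """foo/ -> foo.html and foo/index.html -> foo.html."""
    if path.endswith("/" + INDEX):
        path = path[: -len(INDEX)]
    # The root path has no page name to attach the extension to.
    if path.endswith("/") and path != "/":
        return path.rstrip("/") + HTML
    return None
```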
Forced redirects
All redirects need to be checked when using forced redirects. An exception can be made for prefix redirects, since they will only make sense for single version projects (versioned projects will 404 if an unknown path is given).
Changes
The easiest change will be to stop using the resolver for page and HTML/HTML dir redirects. The other change requires changing the queryset to filter by the redirects that are valid for the given request. This may reduce the complexity of the final query and make it faster (I haven't tested this, so it is just an assumption). If a project has lots of redirects, we may see a small improvement.
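
One way to picture the queryset change is a helper that picks the redirect types worth checking for a request, following the validity rules described in each section above. This is only a sketch: the type labels and the function name are illustrative, and it ignores forced redirects.

```python
from typing import FrozenSet

# Illustrative labels, not necessarily the values stored in the database.
PREFIX = "prefix"
PAGE = "page"
EXACT = "exact"
HTML_TO_DIR = "html_to_dir"
DIR_TO_HTML = "dir_to_html"


def redirect_types_for_request(version_found: bool) -> FrozenSet[str]:
    """Return the redirect types that can be valid for this request."""
    # Exact and HTML/HTML dir redirects are checked in all cases.
    types = {EXACT, HTML_TO_DIR, DIR_TO_HTML}
    # Page redirects need a version; prefix redirects need its absence.
    types.add(PAGE if version_found else PREFIX)
    return frozenset(types)
```

The resulting set could then be used to narrow the redirects query to those types, so that only candidates that can apply to the request are fetched.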
@stsewd was this achieved in the latest redirects refactor?
Nope