WPCOM-Legacy-Redirector
Add verify-redirects command
Resolves #13
This command verifies that the redirects are working properly and marks them as published (to denote that they have been verified).
What we're checking for (in plain English)
1. If redirecting to a URL:
   - Does the URL pass `wp_validate_redirect()`, which checks for a valid URL and an allowed host to redirect to?
2. If redirecting to a post ID:
   - Does the post we're redirecting to exist?
   - Is the post we're redirecting to published?
   - Are we redirecting to a post type that is not publicly accessible?
3. If we haven't failed validation yet:
   - Test the redirect
     - Is the redirect a 300 level status code?
       - If the redirect matches our destination URL, then mark the redirect as published (verified)
       - Did we redirect to the same page, but with a trailing slash?
       - If we reach this point, the redirect failed.
     - Is the redirect a 200 level status code?
       - If we reach this point, the redirect never ran. The from URL must not be a 404.
     - Is it a different status code? (not 200 or 300 level)
       - Something bad happened, report the status code.
Performance features & notes
- We're querying for 500 redirect entries at a time, sleeping for 1 second every 100 rows processed.
- The main query uses a left join on the same table to capture post parent info that allows us to run all the checks in section 2 without calling `get_post()`. This allows us to fail early if there's a problem before we need to call `get_post()` (the most expensive query we're using).
- The query fetches rows ordered by post_parent, so multiple redirects to the same post are verified one after another. I've added code that will reuse the post object in scenarios like this, cutting down on unnecessary calls.
The most expensive verification will be redirects to post ids that pass verification and need to be marked as published (verified). Failure notices should be optimized to happen as quickly and efficiently as possible.
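As a rough illustration of the throttling and post-reuse ideas above, here is a minimal Python sketch (not the plugin's PHP). The names `verify_all`, `batches`, `verify_row`, and `load_post` are hypothetical; `load_post` stands in for the expensive `get_post()` call, and rows are assumed to arrive already ordered by `post_parent`.

```python
import time
from typing import Callable, Dict, Iterable, List


def verify_all(
    batches: Iterable[List[dict]],
    verify_row: Callable[[dict, Callable[[int], dict]], None],
    load_post: Callable[[int], dict],
    pause_every: int = 100,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Verify rows batch by batch, pausing periodically and reusing the last post."""
    processed = 0
    cache: Dict[int, dict] = {}  # holds at most one entry: the last loaded post

    def cached_post(post_id: int) -> dict:
        # Only call the expensive loader when the target post changes.
        if post_id not in cache:
            cache.clear()
            cache[post_id] = load_post(post_id)
        return cache[post_id]

    for rows in batches:  # e.g. batches of 500 rows from the main query
        for row in rows:
            # verify_row decides whether it needs the post at all, so rows
            # that fail early never trigger a load.
            verify_row(row, cached_post)
            processed += 1
            if processed % pause_every == 0:
                sleep(pause_seconds)
    return processed
```

Because the cache holds a single entry, it only pays off when consecutive rows share a target post, which is why the ordering by `post_parent` matters.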
Testing Instructions
- Set up a blank site on your localhost of choice
- Run the following commands in terminal:
  - `wp plugin install wordpress-importer --activate`
  - `wp import --prompt`
    - download a copy of the file from https://wpcom-themes.svn.automattic.com/demo/theme-unit-test-data.xml
    - `--author=create`
- Using the following codeblock, make a csv file and import it into the site:
/redirect-to-non-url,this-isnt-a-url
/redirect-to-a-bad-url,/lot$-0f-+_)(*&^%$#!—Junk
/redirect-to-a-non-whitelisted-host,https://google.com
/redirect-to-autodraft,3
/redirect-to-scheduled-post,1153
/redirect-to-attachment,1692
/redirect-to-published-post,993
/redirect-to-published-page,735
notaurl-redirecting-topublishedpage/735
/lot$-0f-+_)(*&^%$#!—Junk/735
/post-id-that-doesnt-exist/20000
/sample-page/,/anywhere
/sample-page,this-should-redirect-to-include-trailing-slash
- In terminal, run `wp wpcom-legacy-redirector verify-redirects --format=table`