
Rewrite Rules not working

Open swtrse opened this issue 5 years ago • 9 comments

Hello,

I have these settings

  • WP site: www.dynpage.com
  • Static version: www.statpage.com

I need to rewrite a specific URL in the static version of the WP site. I added this rewrite rule to achieve this:

https://www.dynpage.com/contact/,https://contact.dynpage.com/

I also tried the following, in case the URL rewrite for the static page happens first:

https://www.statpage.com/contact/,https://contact.dynpage.com/

Unfortunately, the URLs are never touched. How can I achieve this? There is a hint that "URLs will be first checked to ensure they are part of this site, not an external WordPress site, which would mess up URLs of linked images, etc". But does that include the new URL too? Is this maybe the reason, and do I have to postprocess all files with "sed" after the export myself?

swtrse avatar Jun 07 '20 01:06 swtrse

Hi @swtrse, I'm assuming this is for version 6.6.7?

Can you please share a before/after code sample from your site (paste here as a code block), you can obfuscate the parts that aren't needed/adjust the URLs, but I'd like to see the exact format of the original code.

The rewrite destination won't be getting checked during rewriting whether it's internal or not, so any string replacement should work.

How are you planning to host the content and do the actual redirect between the subdomain and /contact/?

leonstafford avatar Jun 11 '20 16:06 leonstafford
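For the before/after comparison requested above, one quick check is whether the old URL still appears anywhere in the exported files. The following is only a minimal sketch: `list_unrewritten` is a hypothetical helper name, and the folder and URL in the usage comment are taken from later in this thread as examples.

```shell
#!/bin/sh
# Sketch: list exported files that still contain a URL the rewrite rule
# was supposed to replace. list_unrewritten is a hypothetical helper name.

list_unrewritten() {
    dir=$1   # folder holding the exported static files
    url=$2   # URL that should no longer appear after rewriting

    # -r recurses into the folder, -l prints only file names,
    # -F treats the URL as a literal string (dots are not wildcards),
    # -- ends option parsing.
    grep -rlF -- "$url" "$dir"
}

# Example call (paths are assumptions):
# list_unrewritten /var/www/2021.aninite.at/html https://2021.aninite.at/kontakt
```

The function returns a non-zero exit status when no file contains the URL, so it can also be used in an `if` to decide whether post-processing is needed.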

Try keeping expected link in advanced at dev site, then start exporting.

thegulshankumar avatar Jun 14 '20 03:06 thegulshankumar

OK, that's easy since both versions are online. You can look at https://www.aninite.at for the original version and https://2021.aninite.at for the version created with wp2static. I deactivated my sed workaround.

The link in question is the link to the contact page: https://www.aninite.at/kontakt -> https://kontakt.aninite.at. I use wp2static to create an archive of the site. However, there is a requirement that the contact page link always points to the current contact page.

I put the process into a script that runs daily:

```sh
#!/bin/sh

event=aninite
archive=2021

contactPage=https://$archive.$event.at/kontakt
contactCurrentPage=https://kontakt.$event.at
sourceHtmlFolder=/var/www/$event.at/html
sourcePage=https://www.$event.at
targetRootFolder=/var/www/$archive.$event.at
targetHtmlFolder=$targetRootFolder/html
targetPage=https://$archive.$event.at
startDirectory=$PWD

echo SourceFolder: $sourceHtmlFolder
echo SourcePage: $sourcePage
echo TargetFolder: $targetHtmlFolder
echo TargetPage: $targetPage
echo ArchivedContactPage: $contactPage
echo RewriteContactPage: $contactCurrentPage

if [ ! -d "$targetRootFolder" ]
then
    echo Directory $targetRootFolder does not exist, it will be created.
    mkdir -p "$targetRootFolder"
    chown -R nginx:nginx "$targetRootFolder"
fi

ulimit -Hn 500000
ulimit -Sn 500000

cd "$targetRootFolder"

/var/www/wp-cli/wp --allow-root wp2static options set targetFolder "$targetHtmlFolder" --color --url="$sourcePage" --path="$sourceHtmlFolder"
/var/www/wp-cli/wp --allow-root wp2static options set baseUrl "$targetPage" --color --url="$sourcePage" --path="$sourceHtmlFolder"
/var/www/wp-cli/wp --allow-root wp2static options set excludeURLs "/wp-admin/" --color --url="$sourcePage" --path="$sourceHtmlFolder"
/var/www/wp-cli/wp --allow-root wp2static options set additionalUrls "/wp-content/plugins/wordpress-seo/css/main-sitemap.xsl" --color --url="$sourcePage" --path="$sourceHtmlF$
/var/www/wp-cli/wp --allow-root wp2static options set crawl_increment 5 --color --url="$sourcePage" --path="$sourceHtmlFolder"
/var/www/wp-cli/wp --allow-root wp2static options set rewrite_rules "$contactPage,$contactCurrentPage" --color --url="$sourcePage" --path="$sourceHtmlFolder"

rm -Rf "$targetHtmlFolder"

/var/www/wp-cli/wp --allow-root wp2static generate --color --url="$sourcePage" --path="$sourceHtmlFolder"
ulimit -Sn 1024
ulimit -Hn 4096
#sedCommand="s@"$contactPage([/]?)"@"$contactCurrentPage\1"@g"
#sedCommand=${sedCommand//./\.}
#find "$targetHtmlFolder" -type f -exec sed -r -i $sedCommand {} +

chown -R nginx:nginx "$targetHtmlFolder"
cd "$startDirectory"
```

swtrse avatar Jun 17 '20 07:06 swtrse
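The commented-out sed lines in the script above could be packaged as a small function. This is a rough sketch, not the script author's exact code: `rewrite_url` is a hypothetical name, GNU sed is assumed (for `-i` without a suffix, as in the original script), and the escaping is done with sed itself so that it also works in plain `sh`, where the `${var//./\.}` substitution is not available.

```shell
#!/bin/sh
# Sketch: replace one URL with another in every exported file.
# rewrite_url is a hypothetical helper name; GNU sed is assumed for -i.

rewrite_url() {
    dir=$1    # folder holding the exported static files
    from=$2   # URL to look for (matched literally)
    to=$3     # URL to put in its place

    # Escape characters that are special in a basic regular expression,
    # plus the '@' used as the delimiter of the s command below.
    from_re=$(printf '%s' "$from" | sed 's/[][\.*^$@]/\\&/g')
    # Escape characters that are special in the replacement part.
    to_re=$(printf '%s' "$to" | sed 's/[\&@]/\\&/g')

    # A trailing slash after the old URL is not matched, so it is kept.
    find "$dir" -type f -exec sed -i "s@${from_re}@${to_re}@g" {} +
}

# Example call with the variables from the script above:
# rewrite_url "$targetHtmlFolder" "$contactPage" "$contactCurrentPage"
```

Like the original workaround, this touches every file under the folder; adding a `-name '*.html'` test to `find` would narrow it if only HTML pages need the change.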

> Try keeping expected link in advanced at dev

I have absolutely no idea what you are trying to say. If you mean using the expected link on the WordPress site itself, then this is not an option for all locations. It works for links I place in posts, news, or other content, but the menu is autogenerated from the site structure. A custom menu could achieve this, but then a new page would only be visible once it was added to the custom menu by hand. That is not what is intended.

swtrse avatar Jun 17 '20 10:06 swtrse

There is a plugin called "Real Time Find and replace", might be helpful at Dev site if you want any on-the-fly changes.

thegulshankumar avatar Jun 17 '20 10:06 thegulshankumar

> There is a plugin called "Real Time Find and replace", might be helpful at Dev site if you want any on-the-fly changes.

Well, that would work too. But I like to keep plugins to a minimum on the WordPress side. Even though that solution is totally valid, I prefer my sed workaround over another WP plugin. And I would absolutely prefer it if wp2static could handle this.

swtrse avatar Jun 17 '20 12:06 swtrse

AFAIK, wp2static offers rewriting of template paths, such as /wp-content/ to /content/, with the intent of hiding traces of WordPress and keeping bots away from unnecessary crawling.

@leonstafford Is there any filter to modify the HTML output, something like a string replacement?

thegulshankumar avatar Jun 17 '20 12:06 thegulshankumar

@swtrse it's great to see your script there. Even though it's a workaround, when running via WP_CLI, it's nice to see the flexibility for such use cases. As the URL is on the same domain host, it is easiest to keep the rewriting at the end of the process, after everything else is done. If you don't want to do that in the script, you could use the statichtmloutput_post_deploy_trigger action (if you're using the zip deploy option, can still ignore zip and do what you want with the files). That could be added to your theme's functions.php or into a custom WP plugin you make. If there's a pattern to how the contact forms and URLs work across all your sites, a 1-file plugin with such a script might work for you. Else, what you're doing now looks pretty good.

If you grab the release 6.6.20 from this repo's Releases page, I'd think you can remove the ulimit commands as they should be mitigated by no longer reading/writing the files so much during exporting (not confirmed, may still be an issue just due to the amount of files being opened/written during crawl/post processing, maybe I need to close file handles explicitly - would be keen to hear).

leonstafford avatar Jun 17 '20 16:06 leonstafford
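If the ulimit commands do turn out to be needed, one option is to change the limit inside a subshell so that it reverts automatically, instead of resetting it to fixed values afterwards. A minimal sketch, assuming the shell's `ulimit` builtin accepts `-S` and `-n` (bash and dash do); `with_soft_nofile` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: run one command with a different soft limit on open files.
# with_soft_nofile is a hypothetical helper name.

with_soft_nofile() {
    limit=$1
    shift
    (
        # The limit is changed only inside this subshell, so the calling
        # shell keeps its own limits once the command has finished.
        ulimit -Sn "$limit" || exit 1
        "$@"
    )
}

# Example call (command taken from the script above; the value is an assumption
# and must not exceed the hard limit of the calling process):
# with_soft_nofile 4096 /var/www/wp-cli/wp --allow-root wp2static generate \
#     --url="$sourcePage" --path="$sourceHtmlFolder"
```

Raising the hard limit itself still requires root, so this only replaces the soft-limit lines of the original script.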

I'll stick with my script then :)

swtrse avatar Jun 20 '20 21:06 swtrse