
Rewrite URLs in imported WXR files to avoid broken navigation links (white screen, errors, nested Playground)

Open bph opened this issue 1 year ago • 13 comments

On this Playground site, I get intermittent success when using the navigation menu. One page load works, then subsequent page loads show a white screen. The content is all working when I go to WP-admin > Pages and use the View link of each page. But sometimes one link works and then the next one doesn't.

The content and blueprint can be viewed in this repo.

Here is a video of my clicking around on the site.

https://github.com/user-attachments/assets/07cc8bf7-fdb2-41e6-ad0b-a7e58476810a

bph avatar Sep 18 '24 12:09 bph

When I navigate around the site without using the links in the navigation block, the pages load consistently: archive page, author page, single posts, etc. When I click again on one of the links in the navigation bar, I get a white screen once more.

https://github.com/user-attachments/assets/bb08f670-f9d4-4258-8ec2-9bc94144e553

At the end of the video, you can see me click on the Home link in the navigation; it actually loads another Playground instance into the site.

Screenshot 2024-09-18 at 14 52 51

bph avatar Sep 18 '24 12:09 bph

I'm wondering if this issue and #349 could be related.

This seems like a caching bug to me. Playground is trying to load the page from cache while it should call PHP. Screenshot 2024-09-18 at 14 14 58

bgrgicak avatar Sep 18 '24 13:09 bgrgicak

Thank you @adamziel for setting me straight... Glad it was so easy to transfer.

bph avatar Sep 19 '24 13:09 bph

The default instance of playground uses a URL like https://playground.wordpress.net/scope:0.5198681762892301/?page_id=2 (TT4, Sample page in Header)

On my site it only has the URL https://playground.wordpress.net/about-us

Is there a way for me to modify the URL in the Navigation space of my .xml file from relative links like

<!-- wp:navigation-link {"label":"About Us","type":"page","description":"","id":28,"url":"/about-us/","kind":"post-type"} /-->

to something like https://playground.wordpress.net/scope:{somestring}/about-us? That line of code is in the XML import, which I modified to replace the original site's absolute links with relative links.
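For illustration, the rewrite being asked about could be sketched as follows. This is only a sketch: `add_scope` and `PLAYGROUND_ORIGIN` are hypothetical names, and the `/scope:<id>/` prefix is assumed from the default-instance URL quoted above, not taken from Playground's code.

```python
from urllib.parse import urlsplit

# Assumption: the origin and the "/scope:<id>/" prefix mirror the default
# Playground URL quoted in this thread.
PLAYGROUND_ORIGIN = "https://playground.wordpress.net"


def add_scope(relative_url: str, scope: str) -> str:
    """Turn a site-relative URL into an absolute, scoped Playground URL."""
    parts = urlsplit(relative_url)
    path = parts.path if parts.path.startswith("/") else "/" + parts.path
    # Leave URLs that already carry a scope prefix untouched.
    if not path.startswith("/scope:"):
        path = f"/scope:{scope}{path}"
    query = f"?{parts.query}" if parts.query else ""
    fragment = f"#{parts.fragment}" if parts.fragment else ""
    return f"{PLAYGROUND_ORIGIN}{path}{query}{fragment}"


print(add_scope("/about-us/", "0.5198681762892301"))
# prints https://playground.wordpress.net/scope:0.5198681762892301/about-us/
```

The scope value changes per Playground instance, so a rewrite like this would have to run at import time rather than being baked into the .xml file.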

bph avatar Sep 19 '24 13:09 bph

This is definitely related to scope.

After the first load, the page is /. When you click on a page like /patterns/ the referer (/) has scope so we avoid caching. When you click on /news/ the referer (/patterns/) doesn't have scope and it goes to cache.

I think that there is an underlying problem because / gets a scope when used as a referer, while /patterns/ doesn't.
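A toy model of the behavior described above, to make the sequence concrete. It is not Playground's actual request-routing code: `has_scope` and `served_by` are hypothetical names, and the rule "scoped request or scoped referer goes to PHP, everything else goes to cache" is an assumption drawn only from this comment.

```python
import re
from urllib.parse import urlsplit

# Assumption: a scoped URL has a first path segment of the form "scope:<id>".
SCOPE_SEGMENT = re.compile(r"^/scope:[^/]+(/|$)")


def has_scope(url: str) -> bool:
    """Return True if the URL path starts with a /scope:<id> segment."""
    return bool(SCOPE_SEGMENT.match(urlsplit(url).path))


def served_by(request_url: str, referer: str) -> str:
    """Model: a request is routed to PHP only if it or its referer is scoped."""
    if has_scope(request_url) or has_scope(referer):
        return "php"
    return "cache"


home = "https://playground.wordpress.net/scope:0.5198681762892301/"
patterns = "https://playground.wordpress.net/patterns/"
news = "https://playground.wordpress.net/news/"

print(served_by(patterns, referer=home))  # first click: referer is scoped
print(served_by(news, referer=patterns))  # second click: nothing is scoped
```

Under this model the first click reaches PHP and the second one falls through to the cache, which matches the "one page works, the next is white" pattern in the report.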

bgrgicak avatar Sep 20 '24 07:09 bgrgicak

Is there a way for me to modify the URL in the Navigation space of my .xml file from relative links

To something like https://playground.wordpress.net/scope:{somestring}/about-us? The line of code is in the XML import, that I modified to remove the original site's absolute links to show only relative links.

Great research @bph! You are right about the root cause being imported URLs that aren't rewritten.

I'm not sure what's the best way to address this and will need to work with @adamziel and @brandonpayton on finding possible next steps.

bgrgicak avatar Sep 20 '24 08:09 bgrgicak

It looks like we are attempting to add the scope to the URL if it doesn't exist, but that scope isn't used later by the browser or by our code (I still don't know which).

bgrgicak avatar Sep 20 '24 08:09 bgrgicak

I see a few directions here, but I'm not sure what to do.

  • Rewrite all URLs upfront. In this case, ensure WXR imported URLs have a scope.
  • Find a way to inject scope into URLs after Playground loads.
  • Find another way to store and propagate scope instead of prepending it to the URL.

bgrgicak avatar Sep 20 '24 08:09 bgrgicak

I'm moving this to blocked until I get some feedback from @WordPress/playground-maintainers.

bgrgicak avatar Sep 20 '24 08:09 bgrgicak

@bgrgicak thank you so much for pushing this forward.

This is actually also a problem when migrating sites to other servers, as absolute links need a search/replace pass. If Playground could do it out of the box, there wouldn't be a need for me to modify the original site export file for images and links. And two sections of my tutorial could be cut. 🤔

Seems you have enough information to tackle this. I just want to mention that this is not only a hiccup with the navigation block; it also happens with normal on-page links, as can be seen on the Templates page. Those don't work either. The markup of the link is:

<li>a <a href="/page-no-title/" data-type="page" data-id="192">page  no title template</a> that allows for a Hero image or a Cover block directly on the top of the page. </li>
<!-- /wp:list-item -->

Screenshot 2024-09-20 at 11 39 49

bph avatar Sep 20 '24 09:09 bph

@bph A proper resolution will take a few months. Is there a way you could ship that block without an absolute URL in the href=""? Maybe a relative one would work? Or maybe the block could handle only having a page ID?

Longer answer:

The imported WXR file contains this code:

<!-- wp:list-item -->
<li>a <a href="/page-no-title/" data-type="page" data-id="192">page  no title template</a> that allows for a Hero image or a Cover block directly on the top of the page. </li>
<!-- /wp:list-item -->

This is not rewritten by the WXR importer we're currently using. I'm not aware of a tool that we could use in Playground today that would handle it correctly. I'm planning to fork/build a WXR importer and bake in the URL rewriting using the plumbing we've been exploring for the past year [1] [2]. Once it matures, I'll want to propose it for WordPress core.

[1] https://github.com/adamziel/site-transfer-protocol [2] https://github.com/adamziel/wxr-normalize/pull/1
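To illustrate what rewriting that href could involve, here is a naive sketch. It is not the importer's or the linked projects' implementation: `rewrite_relative_hrefs` is a hypothetical helper, the scoped base URL is an assumed example, and a regex is used only for brevity where a real rewriter would parse the markup properly.

```python
import re

# Matches href values that start with a single "/" (site-relative).
# Protocol-relative ("//host/...") and absolute URLs are left alone.
RELATIVE_HREF = re.compile(r'href="(/(?!/)[^"]*)"')


def rewrite_relative_hrefs(html: str, base_url: str) -> str:
    """Prefix every site-relative href in `html` with `base_url`."""
    base = base_url.rstrip("/")
    return RELATIVE_HREF.sub(lambda m: 'href="' + base + m.group(1) + '"', html)


# Assumption: an example scoped base URL in the style quoted in this thread.
base = "https://playground.wordpress.net/scope:0.5198681762892301"
item = '<li>a <a href="/page-no-title/" data-type="page" data-id="192">page</a></li>'
print(rewrite_relative_hrefs(item, base))
```

A plain string replacement like this would also touch matching text inside code samples or attributes of other elements, which is one reason a structure-aware rewriter is preferable.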

adamziel avatar Sep 23 '24 14:09 adamziel

@adamziel thanks for looking into this again.

I am a bit confused as to what you see as an absolute link and a relative link.

"Maybe a relative one would work?" Isn't <a href="/page-no-title/" a relative link? An absolute link would be something like https://wordpress67.local/page-no-title

bph avatar Sep 27 '24 15:09 bph

So for the header navigation, the examples of how the theme Twenty-Twenty-Four works out of the box got me thinking.

If I added all the pages and was deliberate with the page parent selection, the theme's default navigation would probably work with the page list, create the submenus, and do some voodoo that is built into it. (voodoo = not entirely clear how it works)

So with the v2 blueprint and v2 content, I was able to get this part working.

https://github.com/user-attachments/assets/b369d468-a4ef-4b7f-b5e1-4728cfe2d438

In the video you can see that all links in the top navigation have a scope assigned and load pages from a virtual (or whatever you want to call it) directory. It works because I didn't create a custom navigation block. The automatism built into WordPress takes care of it. But it seems Playground already rewrites links and adds scope to the URLs.

Next steps: Before the next upload

  • Get the image references fixed in the *.xml and
  • make the page links on the Templates page relative again.

bph avatar Sep 27 '24 15:09 bph

This needs one more iteration to rewrite relative URLs in the navigation link block attributes.

adamziel avatar Sep 17 '25 16:09 adamziel

@bph everything should work here now! Feel free to reopen if you experience any more URL-related issues.

adamziel avatar Sep 19 '25 12:09 adamziel

It's better now, but it still has trouble with the links in the Navigation block. This is what I saw on import:

Image

This is what I actually expected:

Image

Here is the part of the *.xml file for this navigation:

<content:encoded><![CDATA[<!-- wp:home-link /-->

<!-- wp:navigation-link {"label":"About Us","type":"page","id":28,"url":"http://localhost:8881/?page_id=28","kind":"post-type"} /-->

<!-- wp:navigation-link {"label":"News","type":"page","id":26,"url":"http://localhost:8881/?page_id=26","kind":"post-type"} /-->

<!-- wp:navigation-submenu {"label":"Templates","type":"page","id":63,"url":"http://localhost:8881/?page_id=63","kind":"post-type"} -->
<!-- wp:navigation-link {"label":"Single Page Layout","type":"page","id":65,"url":"http://localhost:8881/?page_id=65","kind":"post-type"} /-->

<!-- wp:navigation-link {"label":"Page no title","type":"page","id":192,"url":"http://localhost:8881/notfound","kind":"post-type"} /-->

<!-- wp:navigation-link {"label":"404 Page - not found","url":"/apage","kind":"custom"} /-->
<!-- /wp:navigation-submenu -->

<!-- wp:navigation-submenu {"label":"Patterns","type":"page","id":32,"url":"http://localhost:8881/?page_id=32","kind":"post-type"} -->
<!-- wp:navigation-link {"label":"Blocks","type":"page","id":52,"url":"http://localhost:8881/?page_id=52","kind":"post-type"} /-->

<!-- wp:navigation-link {"label":"Page Patterns","type":"page","id":183,"url":"http://localhost:8881/?page_id=183","kind":"post-type"} /-->
<!-- /wp:navigation-submenu -->]]></content:encoded>
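As a rough sketch of the kind of rewriting this markup needs, the `url` attribute of each navigation block comment could be parsed as JSON and re-based. This is not the importer's actual code: `rewrite_nav_urls` is a hypothetical helper, the target base URL is an assumed example, and the regex only handles single-line block comments like the ones above.

```python
import json
import re

# Matches opening or self-closing navigation-link / navigation-submenu block
# comments and captures the block name, the JSON attributes, and the optional
# "/" of a self-closing block. Closing comments ("<!-- /wp:...") do not match.
NAV_BLOCK = re.compile(r"<!-- wp:(navigation-(?:link|submenu)) (\{.*?\}) (/?)-->")


def rewrite_nav_urls(markup: str, old_base: str, new_base: str) -> str:
    """Re-base the "url" attribute of navigation blocks onto `new_base`."""

    def fix(match):
        name, raw_attrs, closer = match.groups()
        attrs = json.loads(raw_attrs)
        url = attrs.get("url", "")
        if url.startswith(old_base):
            # Absolute URL from the exporting site.
            attrs["url"] = new_base + url[len(old_base):]
        elif url.startswith("/"):
            # Site-relative URL, e.g. a custom link.
            attrs["url"] = new_base + url
        # Compact separators keep the attribute JSON in its original style.
        new_attrs = json.dumps(attrs, separators=(",", ":"), ensure_ascii=False)
        return f"<!-- wp:{name} {new_attrs} {closer}-->"

    return NAV_BLOCK.sub(fix, markup)


old = "http://localhost:8881"
new = "https://playground.wordpress.net/scope:0.5198681762892301"  # assumed example
block = '<!-- wp:navigation-link {"label":"404 Page - not found","url":"/apage","kind":"custom"} /-->'
print(rewrite_nav_urls(block, old, new))
```

Both the `http://localhost:8881/...` links and the site-relative custom link are covered here; links for "post-type" entries could alternatively be resolved from their `id` once the pages are imported.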

bph avatar Oct 02 '25 10:10 bph

Just so you know the Nav block will soon have dynamically resolved URLs for links that go to internal entities (Pages, Posts...etc). This could make this whole thing a lot easier.

getdave avatar Oct 10 '25 08:10 getdave

Thank you for reporting @bph! It was a subtle issue in the block markup rewriter: https://github.com/WordPress/php-toolkit/pull/191. I'm backporting it to the importer plugin now.

adamziel avatar Oct 10 '25 16:10 adamziel

Super cool @adamziel Thank you so much for digging in and fixing things!

bph avatar Oct 10 '25 17:10 bph

WordPress importer 0.9.4 is now released and your file is imported correctly!

https://playground.wordpress.net/?import-wxr=https%3A%2F%2Fraw.githubusercontent.com%2Fwptrainingteam%2Ftt5-demo-blueprint%2F911e2b2cfc2f3d63bd7c219c5159d4d758647cf6%2Fplayground-content.xml

adamziel avatar Oct 10 '25 22:10 adamziel