wordpress-playground
[Data Liberation] Tracking issue
Let's use this issue to track Data Liberation: Let's Build WordPress-first Data Migration Tools
Technical plumbing
- [x] https://github.com/WordPress/wordpress-playground/pull/1893
- [x] https://github.com/WordPress/wordpress-playground/pull/1952
- [x] https://github.com/WordPress/wordpress-playground/pull/1967
- [x] https://github.com/WordPress/blueprints-library/pull/116
- [ ] https://github.com/WordPress/wordpress-playground/pull/1968
- [ ] Extension points for plugin-provided URL treatment, e.g. base64_decode specific block attributes before rewriting the URLs
- [ ] Identifying each post's dependency graph to frontload the dependent data first
- [ ] Frontloading media files (fetching them before inserting the `wp_post` where they're used)
- [ ] Dependency management: should we ship all the PHP classes in this repo? Or publish independent plugins for others to start adapting in their work, but with no BC guarantees?
- [ ] Streaming WXR import
- [ ] Streaming SQL import and export
- [ ] Streaming ZIP import and export
- [ ] Per-row version control (like @dmsnell's vector clock idea from https://core.trac.wordpress.org/ticket/60375)
- [ ] A conflict resolution mechanism with filters for plugin authors. Perhaps we won't need one, though.
- [ ] ... More TBD ...
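To make the "dependency graph" item above more concrete, here is a minimal Python sketch of ordering records so that everything a post depends on (media files, authors, other posts) is imported before the post itself. It is only an illustration under stated assumptions, not code from this repository: the record IDs and the `depends_on` mapping are hypothetical, and the ordering relies on `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def import_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return record IDs ordered so every dependency precedes its dependents.

    `depends_on` maps a record ID to the IDs it needs to exist first.
    Raises graphlib.CycleError if records depend on each other in a loop.
    """
    # TopologicalSorter expects {node: predecessors}; static_order()
    # yields predecessors before the nodes that need them.
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical data: post-2 links to post-1, and both embed media files.
depends_on = {
    "post-1": {"image-a.jpg", "author-7"},
    "post-2": {"post-1", "image-b.jpg"},
}
order = import_order(depends_on)
```

A real importer would also need a policy for cycles (for example, two posts linking to each other), which this sketch only surfaces as an error.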
Preliminary roadmap by use-case
- [ ] WXR preprocessor
- Port XML streaming logic from https://github.com/adamziel/wxr-normalize/
- Evaluate URL detection via https://github.com/WordPress/wordpress-develop/pull/7450
- Preprocess all WXR files before importing them to Playground to:
  - Rewrite the content URLs
  - Pre-fetch media files
- Run this before importing WXR files into Playground to start collecting feedback
- [ ] Static block markup editor
- Build a simple plugin to import and export .html files representing specific WordPress pages from GitHub.
- Ship a Blueprint that loads Playground Docs into Playground
- We need to have a real use-case for interacting with data liberation on a daily basis and this is one. It's a super low-friction way of maintaining the Playground documentation and WordPress-on-GitHub-pages in general. (cc @bph @akirk)
- [ ] Reliable Playground ZIP export / import
- Fork the Sandbox Site plugin
- Improve the SQL export to make it streamable and ensure there are absolutely no issues with escaping
- Rewrite the exported and imported site URLs
- Include extension points to enable custom treatment of any block attribute, database row etc. See one of the GitHub discussions referenced in #1888
- Consider shipping `.sql` files with the export to potentially enable importing the resulting `.zip` in a regular MySQL-based server environment
- ...anything else actually?
- [ ] "Duplicate Playground" feature
- Iteration 1: Pipe the ZIP export to ZIP import
- Iteration 2: Mount `/wordpress-new` in the duplicated Playground instance, run the PHP export/import code to migrate the site from `/wordpress` there
- Iteration 3: Keep track of progress, make it resumable regardless of when the process is interrupted. This would enable exporting really big sites
- [ ] Direct WordPress <-> WordPress transfer
- Conceptually, this is like running Duplicate Playground over the internet
- Important to keep track of progress and resource versions using a vector clock
- Export / Import UI with scope (users? posts? etc.), error info (image.jpg couldn't be fetched after 3 retries), and error resolution mechanism (specify a different url? upload that image? retry 4th time?)
- [ ] Live WordPress <-> WordPress data sync
- Run the WordPress <-> WordPress transfer in a continuous way.
- This is not about collaborative editing in the block editor, although there is likely an overlap around data synchronization.
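Since vector clocks come up both for per-row version control and for WordPress <-> WordPress transfer, here is a minimal Python sketch of how two versions of the same row could be compared. This is a generic textbook vector clock, not the design proposed in the Trac ticket; the site names and function names are hypothetical.

```python
def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Compare two vector clocks.

    Returns 'equal', 'before' (a happened before b), 'after'
    (a happened after b), or 'concurrent' (neither dominates,
    so the two versions conflict). Missing sites count as 0.
    """
    sites = set(a) | set(b)
    a_le_b = all(a.get(s, 0) <= b.get(s, 0) for s in sites)
    b_le_a = all(b.get(s, 0) <= a.get(s, 0) for s in sites)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def tick(clock: dict[str, int], site: str) -> dict[str, int]:
    """Return a copy of the clock with `site`'s counter incremented (a local edit)."""
    updated = dict(clock)
    updated[site] = updated.get(site, 0) + 1
    return updated


def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Combine two clocks after a sync by taking the per-site maximum."""
    return {s: max(a.get(s, 0), b.get(s, 0)) for s in set(a) | set(b)}


# Hypothetical scenario: both sites edit the same row after the last sync.
synced = {"site-a": 1, "site-b": 1}
edited_on_a = tick(synced, "site-a")
edited_on_b = tick(synced, "site-b")
status = compare(edited_on_a, edited_on_b)  # neither clock dominates the other
```

A `'concurrent'` result is the case where a conflict resolution mechanism, if one turns out to be needed, would have to step in.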
Here are a few more use cases we'll likely tackle along the way, but they're not key milestones on their own:
- [ ] WXR importer
- Fork https://github.com/humanmade/WordPress-Importer
- Give attribution to the original team, ping them and start a conversation
- Port it to WP_XML_Tag_Processor
- Start using that fork for importing WXR files in Playground
- Rewrite the imported site URLs
- Use AsyncHTTP\Client for fetching assets
- Make it resumable if it fails halfway through
- Publish it as a standalone plugin to start gathering feedback and bug reports
- Include extension points to enable custom treatment of any block attribute, database row etc. See one of the GitHub discussions referenced in #1893
- [ ] Markdown exporter / importer for editing existing documentation sites from GitHub
- Discuss using it for editing Playground docs, Gutenberg docs, and potentially all WordPress docs
- Discuss using it as a drop-in static site generator replacement (e.g. Jekyll)
- Adapt the exhaustive MySQL parser explored by @janjakes to parse markdown in PHP. It should only require swapping the grammar.
- Migrate @dmsnell's Markdown <-> Block markup TypeScript converter from https://github.com/dmsnell/blocky-formats to PHP
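Several items above involve rewriting site URLs during import. As a rough illustration of what that step does, here is a minimal Python sketch that moves URLs from one site base to another while leaving third-party URLs untouched. It is a simplification under stated assumptions: the regex-based URL detection is only a stand-in for whatever detection approach the project settles on, the function name and example domains are hypothetical, and URLs hidden inside encoded block attributes (the base64 case mentioned earlier) are not handled.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Naive URL detection: an assumption made for this sketch only.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def rewrite_urls(content: str, old_base: str, new_base: str) -> str:
    """Rewrite URLs under `old_base` so they point at `new_base`."""
    old = urlsplit(old_base)
    new = urlsplit(new_base)
    old_path = old.path.rstrip("/")
    new_path = new.path.rstrip("/")

    def replace(match: re.Match) -> str:
        url = urlsplit(match.group(0))
        # Only touch URLs on the old host (the scheme is ignored on purpose).
        if url.netloc.lower() != old.netloc.lower():
            return match.group(0)
        # Only touch URLs inside the old site's path prefix.
        if not (url.path == old_path or url.path.startswith(old_path + "/")):
            return match.group(0)
        path = new_path + url.path[len(old_path):]
        return urlunsplit((new.scheme, new.netloc, path, url.query, url.fragment))

    return URL_PATTERN.sub(replace, content)


# Hypothetical content with one site URL and one third-party URL.
content = (
    '<a href="https://old.example/blog/hello/?p=1#top">x</a> '
    '<img src="https://cdn.example/a.png">'
)
rewritten = rewrite_urls(content, "https://old.example/blog", "https://new.example")
```

An extension point for plugins could wrap `replace` so that a plugin decodes an attribute, rewrites the URLs inside it, and encodes it again.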