wordpress-importer

Remapped image URLs in block attributes are not replaced correctly

Open lgersman opened this issue 4 years ago • 4 comments

Hi guys,

If you import a page/post containing, let's say, a wp:cover block, the image URL attribute of the block will not be replaced, because block attributes are JSON-escaped.

An example:

<!-- wp:cover {"url":"http:\/\/example.org\/wp-content\/uploads\/2021\/06\/zdf-hitparade.jpg","id":5} -->
<div class="wp-block-cover has-background-dim">
  <img class="wp-block-cover__image-background wp-image-5" alt="" src="http://example.org/wp-content/uploads/2021/06/zdf-hitparade.jpg" data-object-fit="cover"/>
  <div class="wp-block-cover__inner-container">
    <!-- wp:paragraph {"align":"center","placeholder":"Write title\u2026","textColor":"white","className":"","fontSize":"large"} -->
      <p class="has-text-align-center has-white-color has-text-color has-large-font-size">WOW</p>
    <!--/wp:paragraph -->
  </div>
</div>
<!-- /wp:cover -->
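
To make the mismatch concrete, here is a minimal sketch in plain PHP (no WordPress required). It assumes the block attributes were serialized with PHP's default json_encode() behaviour, which escapes forward slashes; the variable names are only illustrative.

```php
<?php
// Plain URL, as it appears in the <img src="..."> attribute.
$url = 'http://example.org/wp-content/uploads/2021/06/zdf-hitparade.jpg';

// json_encode() escapes "/" as "\/" by default and adds surrounding quotes.
$escaped = json_encode( $url );
echo $escaped, PHP_EOL;

// Block comment as in the example above (attributes shortened).
$block_comment = '<!-- wp:cover {"url":' . $escaped . ',"id":5} -->';

// The plain URL is not a substring of the serialized attributes ...
$plain_found = strpos( $block_comment, $url ) !== false;
// ... but the JSON-escaped form is.
$escaped_found = strpos( $block_comment, $escaped ) !== false;

var_dump( $plain_found, $escaped_found );
```

So a search for the plain URL can never match inside the block comment, which is why the block attribute keeps pointing at the old file.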

If attachment import is enabled AND an image named zdf-hitparade.jpg already exists locally, the importer creates a new attachment for the image and stores the file under the new name zdf-hitparade-1.jpg.

At the end of the attachment import, the posts are processed to replace the old reference to the image with the new image URL (http://example.org/wp-content/uploads/2021/06/zdf-hitparade.jpg => http://example.org/wp-content/uploads/2021/06/zdf-hitparade-1.jpg).

This works fine for the <img> element, but not for the url attribute of wp:cover, since it is JSON-escaped. To fix this, simply change the code at https://github.com/WordPress/wordpress-importer/blob/e05f678835c60030ca23c9a186f50999e198a360/src/class-wp-import.php#L1271 from

$wpdb->query( $wpdb->prepare( "UPDATE {$wpdb->posts} SET post_content = REPLACE(post_content, %s, %s)", $from_url, $to_url ) );

to

$wpdb->query($wpdb->prepare("UPDATE {$wpdb->posts} SET post_content = REPLACE( REPLACE(post_content, %s, %s), %s, %s)", $from_url, $to_url, json_encode($from_url), json_encode($to_url)));

and everything works like a charm.
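
The effect of the two queries can be sketched in plain PHP, with str_replace() standing in for SQL REPLACE() (an assumption made for illustration; the importer itself runs the replacement in the database). The sample $post_content is a shortened version of the example above.

```php
<?php
$from_url = 'http://example.org/wp-content/uploads/2021/06/zdf-hitparade.jpg';
$to_url   = 'http://example.org/wp-content/uploads/2021/06/zdf-hitparade-1.jpg';

// Shortened post content: one JSON-escaped URL (block attribute)
// and one plain URL (<img> element).
$post_content = '<!-- wp:cover {"url":' . json_encode( $from_url ) . ',"id":5} -->'
	. '<img src="' . $from_url . '"/>'
	. '<!-- /wp:cover -->';

// Current behaviour: only the plain URL is replaced.
$current = str_replace( $from_url, $to_url, $post_content );

// Proposed behaviour: replace the plain URL first, then the JSON-escaped one.
$proposed = str_replace(
	json_encode( $from_url ),
	json_encode( $to_url ),
	str_replace( $from_url, $to_url, $post_content )
);

// The old escaped URL survives the current replacement, but not the proposed one.
$stale_in_current  = strpos( $current, json_encode( $from_url ) ) !== false;
$stale_in_proposed = strpos( $proposed, json_encode( $from_url ) ) !== false;

var_dump( $stale_in_current, $stale_in_proposed );
```

Note that json_encode() wraps the URL in double quotes, so the second replacement only matches where the URL is the complete value of a JSON string, as it is in the url attribute here.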

Kind regards and have a nice weekend,

Lars

lgersman, Jun 18 '21 13:06

With that change, wouldn't any existing plain URLs be unchanged?

joyously, Jun 18 '21 13:06

Both the plain and the escaped variants will be changed by the suggested fix.

As you might have seen in the suggested change, $from_url and its escaped variant are replaced with $to_url and its escaped variant, respectively.

Without that change, every page/post containing a Gutenberg block with an image URL attribute will display wrong data in Gutenberg after being imported (since the block attributes are not changed by the current replacement code).

lgersman, Jun 18 '21 17:06

So, classic HTML (or legacy HTML), of which there are at least 15 years worth, would not be replaced correctly?

joyously, Jun 18 '21 17:06

That's not what I said.

The current implementation is kind of brute force anyway and replaces ALL old image references with the new ones, not only those in the freshly imported pages/posts. So it can break existing content anyway.

What I reported is that the current implementation does not result in clean, normalized pages/posts for content containing wp:cover and friends.

It is possible that my proposed fix breaks 15-year-old content under some circumstances. On the other hand, the fix would make the import compatible with Gutenberg. Decide for yourself :-)

lgersman, Jun 18 '21 19:06