wordpress-seo icon indicating copy to clipboard operation
wordpress-seo copied to clipboard

Add image schema to schema data

Open KiOui opened this issue 2 years ago • 4 comments

Context

  • We want to add image schema data to the schema we are currently rendering.

Summary

This PR can be summarized in the following changelog entry:

  • Add a renderer for rendering schema data.

Relevant technical choices:

  • The featured image was previously rendered with a custom tag. This now uses a normal schema id.
  • Refactored WPSEO_Sitemap_Image_Parser such that we are able to use the image parsing code in other places as well (please do check this thoroughly).

Test instructions

Test instructions for the acceptance test before the PR gets merged

This PR can be acceptance tested by following these steps:

Check whether image schema data works for posts without images:

  • Go to the post editor and create a new empty post.
  • Save the post and go to the public (non-editor) view of the post you just created.
  • Go to the HTML source of the post and check that there is no primaryImageOfPage entry in the WebPage object. Also check that there is no image entry in the WebPage object and that there are no ImageObject objects within the @graph list.

Check whether image schema data works for a post with only a primary image:

  • Go to the post editor and create a new empty post. Set a primary image for the post.
  • Save the post and go to the public (non-editor) view of the post you just created.
  • Go to the HTML source of the post and check that there is exactly one ImageObject object within the @graph list (the ID for the image should be the URL). Also check that the primaryImageOfPage entry in the WebPage object is set to the same ID and that the image entry in the WebPage object also includes the same ID as the only image.

Check whether post content images are listed within the schema data:

  • Go to the post editor and create a new empty post (without a featured image). Add a couple of images to the post. Also add some duplicate images (images that are the same).
  • Save the post and go to the public (non-editor) view of the post you just created.
  • Go to the HTML source of the post and check that there is exactly one ImageObject object for each unique image added to the post in the @graph list (the ID for the image should be the URL). Also check that the primaryImageOfPage entry in the WebPage object is not set and that the image entry in the WebPage object also includes the ID's of the images that are within the post content.

Check whether post content images (external) are listed within the schema data:

  • Go to the post editor and create a new empty post. Add an external image to that post via the URL (such as https://helpx.adobe.com/content/dam/help/en/photoshop/using/convert-color-image-black-white/jcr_content/main-pars/before_and_after/image-after/Landscape-BW.jpg).
  • Save the post and go to the public (non-editor) view of the post you just created.
  • Go to the HTML source of the post and check that there is an ImageObject object for the external image added to the post in the @graph list (the ID for the image should be the URL). Also check that the image entry in the WebPage object includes the ID of the externally added image.

Note that adding images in Elementor will not work as these are not saved in the post content.

Relevant test scenarios

  • [ ] Changes should be tested with the browser console open
  • [x] Changes should be tested on different posts/pages/taxonomies/custom post types/custom taxonomies
  • [x] Changes should be tested on different editors (Block/Classic/Elementor/other)
  • [ ] Changes should be tested on different browsers
  • [ ] Changes should be tested on multisite

Test instructions for QA when the code is in the RC

  • [x] QA should use the same steps as above.

QA can test this PR by following these steps:

Impact check

This PR affects the following parts of the plugin, which may require extra testing:

  • This PR affects the helper that generates the image data for the sitemap files.

UI changes

  • [ ] This PR changes the UI in the plugin. I have added the 'UI change' label to this PR.

Other environments

  • [ ] This PR also affects Shopify. I have added a changelog entry starting with [shopify-seo], added test instructions for Shopify and attached the Shopify label to this PR.

Documentation

  • [ ] I have written documentation for this change.

Quality assurance

  • [x] I have tested this code to the best of my abilities
  • [ ] I have added unit tests to verify the code works as intended
  • [ ] If any part of the code is behind a feature flag, my test instructions also cover cases where the feature flag is switched off.
  • [ ] I have written this PR in accordance with my team's definition of done.

Fixes PC-19 Fixes PC-20

KiOui avatar Aug 30 '22 15:08 KiOui

Things still left to do:

  • [x] Test the speed of the current implementation.
  • [x] Verify whether the decoupledness of the current implementation should be changed (e.g. there are somewhat the same functions in Webpage and Images.
  • [x] Fix the non-working sitemap class due to dependency injection.
  • [x] Add tests

KiOui avatar Aug 30 '22 15:08 KiOui

I also saw some small things that still need to be changed. In some parts of the schema we are still using the #primaryimage as the ID (such as in the Article -> image object). A quick search for #primaryimage and replacing it with the image URL should work.

KiOui avatar Sep 20 '22 07:09 KiOui

Also if we want to change the @id of every image to the URL of that image, we should take a look at all the schema that outputs an image as the Person and Organisation are using a custom ID as well.

KiOui avatar Sep 20 '22 07:09 KiOui

Things that we need to check/that are left to do:

  • [x] Public function add_image in webpage.php has been changed to a protected function add_primary_image. We might want to add a method that replaces add_image such that this method still works.
  • [x] We have to check whether the const PRIMARY_IMAGE_HASH in schema-ids.php and class-schema-ids.php are used in any of Yoast’s plugins or third-party plugins as the variable is public.
    • The LSX Tour Operator plugin (approximately 60 installs) uses the HASH in the following function this specific hash instance is depreacted since 14.0 and now deleted:
          /**
	 * Adds an image node if the post has a featured image.
	 *
	 * @param array                $data         The Review data.
	 * @param WPSEO_Schema_Context $context      The post ID of the current Place to add.
	 *
	 * @return array $data The Review data.
	 */
	public static function add_image( $data, $context ) {
		if ( $context->has_image ) {
			$data['image'] = array(
				'@id' => $context->canonical . \WPSEO_Schema_IDs::PRIMARY_IMAGE_HASH,
			);
		}
		return $data;
	}
  • [x] We have to check the following plugins for their schema generation:
    • [x] Premium
      • The templates inside src/schema-templates are not used yet and will thus not have any impact on this PR.
    • [x] WPSEO WooCommerce
      • The product image uses the #primaryimage tag, this PR fixes that.
    • [x] WPSEO Video
      • Is not using anything that was adjusted in Free, the plugin is using the VIDEO_HASH for the video identifier.
    • [x] WPSEO Local
      • Is not using anything that was adjusted in Free, the plugin is using other hashes defined in the plugin itself for several identifiers.
    • [x] WPSEO News
      • Is not using anything that was adjusted in Free, the plugin is using the ORGANISATION_HASH for the copyright holder identifier.
  • [x] We changed the Main_Image class and added all other images within the Images class in the schema as well. We should check whether this causes any issues.
                  /**
     * Remove the Yoast SEO schema block that carries the main image
     *
     * @param array $pieces
     *
     * @return array
     */
    private function remove_mainimage_schema_block( $pieces ) {
    	foreach ( $pieces as $key => $piece ) {
    		if ( $piece instanceof \Yoast\WP\SEO\Generators\Schema\Main_Image ) {
    			unset( $pieces[ $key ] );
    			break;
    		}
    	}
    
    	return $pieces;
    }
    
  • [x] ID's for the Person and Organisation should also be adjusted to fit the new schema. -> This will be done in a later issue.
  • [x] Test instructions should be adjusted.

KiOui avatar Sep 21 '22 13:09 KiOui

The class WPSEO_Content_Images in /inc/class-wpseo-content-images.php also does some image parsing that should be refactored to the Image_Helper at some point.

KiOui avatar Sep 27 '22 11:09 KiOui

CR: ✅

pls78 avatar Sep 30 '22 06:09 pls78

There was a

PHP Notice:  Trying to get property 'post_content' of non-object in /var/www/html/wp-content/plugins/wordpress-seo/src/context/meta-tags-context.php on line 228

notice showing on the edit pages of posts, which I took care with this commit.

leonidasmi avatar Sep 30 '22 09:09 leonidasmi

Acceptance Testing: ❌

I found 2 issues with having resized images in the posts:

Post with resized image in the content and no other images

  • Have a post with no feature image and add in the content of the post, an image
  • Click on the image and on the sidebar, select Thumbnail for the Image size
  • Save the post
  • In the frontend, there's an @id in the image attribute of the webpage that doesnt reference any imageobject in the schema and that's the id of the resized image. Example of such a schema, here.

Post with resized image in the content and with a feature image

  • Have a post with a feature image and add in the content of the post, an image
  • Click on the image and on the sidebar, select Thumbnail for the Image size
  • Save the post
  • In the schema of the post, there's no mention of that resized image, just a mention of its full-sized counter-part. Example of such a schema, here.

leonidasmi avatar Sep 30 '22 12:09 leonidasmi

Acceptance Testing: ❌

I found 2 issues with having resized images in the posts:

Post with resized image in the content and no other images

  • Have a post with no feature image and add in the content of the post, an image
  • Click on the image and on the sidebar, select Thumbnail for the Image size
  • Save the post
  • In the frontend, there's an @id in the image attribute of the webpage that doesnt reference any imageobject in the schema and that's the id of the resized image. Example of such a schema, here.

Post with resized image in the content and with a feature image

  • Have a post with a feature image and add in the content of the post, an image
  • Click on the image and on the sidebar, select Thumbnail for the Image size
  • Save the post
  • In the schema of the post, there's no mention of that resized image, just a mention of its full-sized counter-part. Example of such a schema, here.

This happened due to the Image_Helper not taking into account the size of the image and just getting the image URL with size full every time. It is now fixed by extracting the image size from the URL (which we can do because WordPress makes sure that images that are resized get -[width]x[height] appended to their file name) and passing the image size with the Image value to the generators.

The change above also fixes a bug in the open_graph_images being added with http to the graph instead of https. We already fixed this for the Images class but not yet for the WebPage class.

KiOui avatar Oct 04 '22 09:10 KiOui

I notice a small issue with this fix, described with these instructions:

  • Have a post with no feature image and add in the content of the post, an image
  • Click on the image and on the sidebar, select Thumbnail for the Image size
  • Save the post
  • In the schema of that Imageobject, the width/height attributes are showing wrong values, 640&480 instead of 300&225:
{
	"@type": "ImageObject",
	"inLanguage": "en-US",
	"@id": "http://basic.wordpress.test/wp-content/uploads/2022/10/WordPress7-2-300x225.jpg",
	"url": "http://basic.wordpress.test/wp-content/uploads/2022/10/WordPress7-2-300x225.jpg",
	"contentUrl": "http://basic.wordpress.test/wp-content/uploads/2022/10/WordPress7-2-300x225.jpg",
	"width": 640,
	"height": 480
},

leonidasmi avatar Oct 05 '22 14:10 leonidasmi

I notice a small issue with this fix, described with these instructions:

  • Have a post with no feature image and add in the content of the post, an image
  • Click on the image and on the sidebar, select Thumbnail for the Image size
  • Save the post
  • In the schema of that Imageobject, the width/height attributes are showing wrong values, 640&480 instead of 300&225:
{
	"@type": "ImageObject",
	"inLanguage": "en-US",
	"@id": "http://basic.wordpress.test/wp-content/uploads/2022/10/WordPress7-2-300x225.jpg",
	"url": "http://basic.wordpress.test/wp-content/uploads/2022/10/WordPress7-2-300x225.jpg",
	"contentUrl": "http://basic.wordpress.test/wp-content/uploads/2022/10/WordPress7-2-300x225.jpg",
	"width": 640,
	"height": 480
},

I think we need to add functionality to add_image_size inside Image_Helper such that it takes the size passed to generate_from_attachment_id and returns the width and height if the type of size is array and otherwise requests the metadata of the image or requests the file size from WordPress.

KiOui avatar Oct 05 '22 14:10 KiOui

I created a subtask for the above 👍

leonidasmi avatar Oct 06 '22 08:10 leonidasmi

There's a fatal error when accessing the Sitemap:

Fatal error: Uncaught Error: Call to undefined method Yoast\WP\SEO\Helpers\Image_Helper::image_url()
in /var/www/html/wp-content/plugins/wordpress-seo/inc/sitemaps/class-sitemap-image-parser.php on line 166

and I'm gonna push a commit to fix it.

leonidasmi avatar Oct 06 '22 08:10 leonidasmi

CR'ed the fix from Leonidas for the sitemap 👍

thijsoo avatar Oct 06 '22 10:10 thijsoo

There's an issue with external images:

  • when they happen to not be the primaryImageOfPage
  • they have empty url and contentUrl attributes in their ImageObject node:
{
    "@type": "ImageObject",
    "inLanguage": "en-US",
    "@id": "https://helpx.adobe.com/content/dam/help/en/photoshop/using/convert-color-image-black-white/jcr_content/main-pars/before_and_after/image-after/Landscape-BW.jpg?size=large",
    "url": "",
    "contentUrl": ""
},

I'll create a subtask so that we can solve this in the feature branch

leonidasmi avatar Oct 07 '22 09:10 leonidasmi

Acceptance Test is 👍 (apart from the aforementioned, non-blocking issues)

leonidasmi avatar Oct 07 '22 10:10 leonidasmi