3.17 - Frontend process
Description
Implement a new Frontend Controller for WP Rocket to handle the lazy-render-content (LRC) feature. This controller will manage the lazy-rendering of content, reusing the design and file structure of the existing LCP (Largest Contentful Paint) controller, adapted for lazy-rendered content. We will implement three solutions (DOMDocument, Regex, SimpleHtmlDom) to compare their results.
Scope a solution
- Define the Frontend Controller:
  - Create a new controller in the `WP_Rocket\Engine\Optimization\LazyRenderContent\Frontend` namespace.
  - This controller will include methods to apply lazy-rendering optimizations to the content.
- Implement the Controller Class:
  - First, create the `optimize` method required by the controller interface. We can keep it empty for now, as we will attend to it later.
  - Create a method, maybe `create_hash`, that modifies the buffer and adds hashes to eligible elements.
  - Add a conditional check: if LRC data does not exist in the DB, then add the hashes.
  - Utilize the methods from the prototype PR for the new frontend controller:

    Controller: Handles HTML processing using three methods: DOMDocument, Regex, and SimpleHtmlDom.

    ```php
    <?php
    declare(strict_types=1);

    namespace WP_Rocket\Engine\Optimization\LazyRenderContent\Frontend;

    use DOMDocument;
    use voku\helper\HtmlDomParser;
    use voku\helper\SimpleHtmlDomBlank;

    class Controller {
        public function add_locations_hash_to_html_dom( $html ) {
            $dom = new DOMDocument();

            // Parse warnings on real-world HTML are suppressed with @.
            @$dom->loadHTML( $html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD );

            $body = $dom->getElementsByTagName( 'body' )->item( 0 );

            if ( ! $body ) {
                return $html;
            }

            $this->add_hash_to_element_dom( $body, 2 );

            return $dom->saveHTML();
        }

        private function add_hash_to_element_dom( $element, $depth ) {
            if ( $depth < 0 ) {
                return;
            }

            // Despite its name, this lists the tags that receive a hash;
            // every other tag is skipped.
            $skip_tags = [
                'DIV',
                'MAIN',
                'FOOTER',
                'SECTION',
                'ARTICLE',
                'HEADER',
            ];

            // Static, so the counter keeps increasing across recursive calls.
            static $count = 0;

            foreach ( $element->childNodes as $child ) {
                if (
                    XML_ELEMENT_NODE !== $child->nodeType
                    || ! in_array( strtoupper( $child->tagName ), $skip_tags, true )
                ) {
                    continue;
                }

                $child_html       = $child->ownerDocument->saveHTML( $child );
                $opening_tag_html = strstr( $child_html, '>', true ) . '>';

                $hash = md5( $opening_tag_html . $count );

                ++$count;

                $child->setAttribute( 'data-rocket-location-hash', $hash );

                $this->add_hash_to_element_dom( $child, $depth - 1 );
            }
        }

        public function add_locations_hash_to_html_regex( $html ) {
            $result = preg_match( '/(?><body[^>]*>)(?>.*?<\/body>)/is', $html, $matches );

            if ( ! $result ) {
                return $html;
            }

            return $this->add_hash_to_element_regex( $html, $matches[0], 2 );
        }

        private function add_hash_to_element_regex( $html, $element, $depth ) {
            // $depth is only checked here: this variant does not recurse and
            // hashes every matching opening tag found in the body.
            if ( $depth < 0 ) {
                return $html;
            }

            $skip_tags = [
                'div',
                'main',
                'footer',
                'section',
                'article',
                'header',
            ];

            $result = preg_match_all(
                '/(?><(' . implode( '|', $skip_tags ) . ')[^>]*>)/is',
                $element,
                $matches,
                PREG_SET_ORDER
            );

            if ( ! $result ) {
                return $html;
            }

            $count = 0;

            foreach ( $matches as $child ) {
                $opening_tag_html = strstr( $child[0], '>', true ) . '>';

                $hash = md5( $opening_tag_html . $count );

                ++$count;

                // Insert the attribute right after the tag name, then swap the
                // first occurrence of the original opening tag in the buffer.
                $replace = preg_replace( '/' . $child[1] . '/is', '$0 data-rocket-location-hash="' . $hash . '"', $child[0], 1 );
                $html    = preg_replace( '/' . preg_quote( $child[0], '/' ) . '/', $replace, $html, 1 );
            }

            return $html;
        }

        public function add_locations_hash_to_html_simple_html_dom( $html ) {
            $dom = HtmlDomParser::str_get_html( $html );

            $body = $dom->getElementByTagName( 'body' );

            if ( $body instanceof SimpleHtmlDomBlank ) {
                return $html;
            }

            $this->add_hash_to_element_simple_html_dom( $body, 2 );

            return $dom->save();
        }

        private function add_hash_to_element_simple_html_dom( $element, $depth ) {
            if ( $depth < 0 ) {
                return;
            }

            $skip_tags = [
                'DIV',
                'MAIN',
                'FOOTER',
                'SECTION',
                'ARTICLE',
                'HEADER',
            ];

            static $count = 0;

            foreach ( $element->childNodes() as $child ) {
                if ( ! in_array( strtoupper( $child->getTag() ), $skip_tags, true ) ) {
                    continue;
                }

                $child_html       = $child->html();
                $opening_tag_html = strstr( $child_html, '>', true ) . '>';

                $hash = md5( $opening_tag_html . $count );

                ++$count;

                $child->setAttribute( 'data-rocket-location-hash', $hash );

                $this->add_hash_to_element_simple_html_dom( $child, $depth - 1 );
            }
        }
    }
    ```
  - Create `WP_Rocket\Engine\Optimization\LazyRenderContent\Frontend\Subscriber`.
  - Add the `rocket_buffer` event with a callback to the `create_hash` method in the controller. We can set the priority to 16 or 17.
- ServiceProvider and Subscriber Integration:
  - Update the `ServiceProvider` to register the new lazy-render-content controller and Subscriber.
  - Configure the `Subscriber` class to use the new controller and processor logic.
- Testing and Validation:
  - Test the new controller to ensure it applies lazy-rendering correctly using all three methods without causing performance issues.
  - Validate that the `data-rocket-location-hash` attribute is added appropriately to the target elements.
  - Compare the outputs of the three methods to determine the best approach.
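The controller and subscriber steps above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the class names `LrcControllerSketch` and `LrcSubscriberSketch`, the injected callables, and the `optimize` signature are assumptions made so the example runs on its own. Only `optimize`, `create_hash`, the `rocket_buffer` event and the priority come from this issue. The "is allowed" guard reflects the point raised in the comments that the methods must not run when the feature is not activated.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: class names and constructor arguments are assumptions.
final class LrcControllerSketch {
    /** @var callable Returns true when the feature may run in this context. */
    private $is_allowed;

    /** @var callable Returns true when LRC data already exists in the DB. */
    private $has_lrc_data;

    /** @var callable Takes the HTML buffer and returns it with hashes added. */
    private $add_hashes;

    public function __construct( callable $is_allowed, callable $has_lrc_data, callable $add_hashes ) {
        $this->is_allowed   = $is_allowed;
        $this->has_lrc_data = $has_lrc_data;
        $this->add_hashes   = $add_hashes;
    }

    // Compulsory method from the controller interface; left as a no-op for now.
    public function optimize( string $html ): string {
        return $html;
    }

    // Adds location hashes only when the feature is allowed and no data is stored yet.
    public function create_hash( string $html ): string {
        if ( ! ( $this->is_allowed )() || ( $this->has_lrc_data )() ) {
            return $html;
        }

        return ( $this->add_hashes )( $html );
    }
}

final class LrcSubscriberSketch {
    /** @var LrcControllerSketch */
    private $controller;

    public function __construct( LrcControllerSketch $controller ) {
        $this->controller = $controller;
    }

    // Maps the rocket_buffer filter to create_hash with priority 16.
    public static function get_subscribed_events(): array {
        return [
            'rocket_buffer' => [ 'create_hash', 16 ],
        ];
    }

    public function create_hash( string $html ): string {
        return $this->controller->create_hash( $html );
    }
}
```

Injecting the three decisions as callables keeps the sketch independent of the DB layer and of whichever of the three hashing methods is eventually chosen.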
For more detailed implementation and reference, please check the prototype PR.
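For the comparison step under Testing and Validation, one possible approach is to extract the `data-rocket-location-hash` values from each method's output and check whether they agree. The helper names below are hypothetical and the sketch assumes the attribute is serialized with double quotes and an MD5 value, as in the prototype code.

```php
<?php
declare(strict_types=1);

// Returns the data-rocket-location-hash values found in $html, in document order.
function lrc_sketch_extract_hashes( string $html ): array {
    preg_match_all( '/\sdata-rocket-location-hash="([a-f0-9]{32})"/', $html, $matches );

    return $matches[1];
}

// Compares the hash list of each method against the first entry in $outputs.
// $outputs maps a method label (e.g. "dom", "regex") to that method's output HTML.
function lrc_sketch_compare_outputs( array $outputs ): array {
    $report    = [];
    $reference = null;

    foreach ( $outputs as $label => $html ) {
        $hashes = lrc_sketch_extract_hashes( $html );

        if ( null === $reference ) {
            $reference = $hashes;
        }

        $report[ $label ] = [
            'count'             => count( $hashes ),
            'matches_reference' => $hashes === $reference,
        ];
    }

    return $report;
}
```

A report like this only shows whether the methods tag the same elements with the same values; timing each method would still need to be measured separately.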
@Miraeld @jeawhanlee While this is an important first step, this is far from completing the front-end part of the feature. This issue seems mostly to be integrating the prototype in a suitable structure for 3.17. This might need more details such as:
- those methods must not be called if LRC is not activated (based on activation/context?)
Here are the missing points that would probably need dedicated issues for follow-up:
- When LRC data is available for this URL and screen size, then the controller must:
  - apply the location hash (as per this issue)
  - look for added location hashes matching the ones in the DB and replace them with the lazy-render attribute (see here)
  - remove the remaining location hashes
Can you create this issue and groom it?
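To make the follow-up steps listed above more concrete, here is a rough sketch of the replace-and-clean-up pass. It assumes the hashes have already been added to the buffer and that the stored hashes arrive as a plain array. The function name and the `data-lrc-placeholder` attribute are hypothetical placeholders, since the real lazy-render attribute is not named in this thread.

```php
<?php
declare(strict_types=1);

// Replaces location hashes found in $db_hashes with a lazy-render attribute
// and strips every other location hash from the buffer.
// $lazy_attribute is a placeholder name, not the real attribute.
function lrc_sketch_apply_lazy_render( string $html, array $db_hashes, string $lazy_attribute = 'data-lrc-placeholder' ): string {
    $result = preg_replace_callback(
        '/\sdata-rocket-location-hash="([a-f0-9]{32})"/',
        static function ( array $match ) use ( $db_hashes, $lazy_attribute ): string {
            return in_array( $match[1], $db_hashes, true )
                ? ' ' . $lazy_attribute . '="1"'
                : '';
        },
        $html
    );

    // preg_replace_callback returns null on error; fall back to the original buffer.
    return null === $result ? $html : $result;
}
```

Doing both steps in a single pass avoids leaving stray location hashes behind when an element is not part of the stored data.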
@Miraeld @jeawhanlee @wp-media/qa-team I think the AC for this one are exactly the ones for the prototype we already have, right?
@MathieuLamiot I think we can add the missing parts here as I don't think it's worth opening a dedicated git issue, WDYT?
@jeawhanlee Thanks for raising the point. I feel like the missing part (applying the lazy render attribute) will be a complex thing, requiring a few days of work. On the other hand, if we have a frontend controller that only does the part where there is no data in the DB (adding the hashes), then it unlocks many dependencies: we would already be able to test data generation end-to-end: injecting beacon + hashes and getting the hashes in DB after a visit.
So, to avoid having everything blocked because we are waiting for the frontend controller to be completed before we start testing end-to-end, I'd advise going with a dedicated GH issue.
The controller shouldn't be using anything other than DOMDocument, right? The code example uses SimpleHtmlDom, which seems unexpected.
I discussed this with @jeawhanlee and @Miraeld: the longer we can keep the 3 options (DOMDocument, Regex, SimpleDom) the better as it would allow us to switch to another if we discover a limitation further down the road. However, if maintaining multiple solutions for a few weeks adds complexity/development time, we can drop Regex & SimpleDom already. It looked like we could keep them easily for now by just copy-pasting your prototype.
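One way to keep the three options switchable at low cost would be to hide them behind a common interface and pick one by name. The sketch below is only an illustration of that idea: `LrcProcessorSketch` and `LrcProcessorRegistrySketch` are hypothetical names and nothing like them exists in the prototype.

```php
<?php
declare(strict_types=1);

// Hypothetical common contract for the DOMDocument, Regex and SimpleHtmlDom variants.
interface LrcProcessorSketch {
    public function add_hashes( string $html ): string;
}

// Holds the available processors so the active one can be swapped by name.
final class LrcProcessorRegistrySketch {
    /** @var LrcProcessorSketch[] */
    private $processors = [];

    public function register( string $name, LrcProcessorSketch $processor ): void {
        $this->processors[ $name ] = $processor;
    }

    public function get( string $name ): LrcProcessorSketch {
        if ( ! isset( $this->processors[ $name ] ) ) {
            throw new InvalidArgumentException( "Unknown processor: {$name}" );
        }

        return $this->processors[ $name ];
    }
}
```

With this shape, dropping Regex or SimpleHtmlDom later would mean removing one registration rather than touching the controller.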