
Use ElasticPress on WP's core sitemap creation

Open · felipeelia opened this issue 3 years ago · 1 comment

Is your enhancement related to a problem? Please describe.

For big sites, sitemap creation could be resource-intensive. Leveraging ElasticPress in that process could speed things up.

Describe the solution you'd like

Initial investigation suggests that, for posts, the code could be as simple as:

add_filter(
	'wp_sitemaps_posts_query_args',
	function( $args ) {
		$args['ep_integrate'] = true;
		return $args;
	}
);

Similar solutions could be achieved for users and terms if those features are active.
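
A minimal sketch of what that might look like for users and terms, reusing one callback. It assumes ElasticPress honors `ep_integrate` on the user and term queries when the corresponding features are active, which is not confirmed in this issue. `my_sitemap_ep_integrate` is a hypothetical name; `wp_sitemaps_users_query_args` and `wp_sitemaps_taxonomies_query_args` are the core sitemap filters for those providers.

```php
<?php
/**
 * Hypothetical helper (illustrative name, not part of ElasticPress):
 * flags a sitemap query for ElasticPress by setting `ep_integrate`.
 */
function my_sitemap_ep_integrate( $args ) {
    $args['ep_integrate'] = true;
    return $args;
}

// Guarded so the snippet also loads outside WordPress.
if ( function_exists( 'add_filter' ) ) {
    // Assumption: these only change anything when the matching
    // ElasticPress feature (Users / Terms) is active and indexed.
    add_filter( 'wp_sitemaps_users_query_args', 'my_sitemap_ep_integrate' );
    add_filter( 'wp_sitemaps_taxonomies_query_args', 'my_sitemap_ep_integrate' );
}
```

The taxonomies filter also passes the taxonomy name as a second argument; the callback above ignores it because it is registered with the default of one accepted argument.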

Additional context

Props @tott

felipeelia, Mar 10 '22 11:03

@felipeelia

Very interesting, thanks for posting about this!

Using the code you provided, we saw Query Response Code: HTTP 0 in the ElasticPress Query Log. The initial problem was the number of posts requested: once we reduced it to around 250, the query worked.

Filtering wp_sitemaps_posts_pre_url_list instead gave us room for more modifications. Switching fields from all to ids let us get a successful response with a larger posts_per_page.

However, this means we need to call get_permalink() and get_post_field() to build the sitemap data, and I'm not certain of the implications of calling both 1,000 times per sitemap page. That said, the official WordPress code uses get_permalink(), so maybe it isn't a big deal.

add_filter('wp_sitemaps_posts_pre_url_list', 'os_sitemap_callback', 10, 3);

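// $pre_url_list is the filter's incoming value and is unused: this callback builds the list itself.
function os_sitemap_callback( $pre_url_list, $post_type, $page_num ) {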

    $per_page = 1000;
    $offset = ($per_page * $page_num) - $per_page;

    $args = array(
        'orderby'                => 'modified',
        'order'                  => 'ASC',
        'post_type'              => $post_type,
        'posts_per_page'         => $per_page,
        'offset'                 => $offset,
        'post_status'            => array( 'publish' ),
        'no_found_rows'          => true,
        'update_post_term_cache' => true,
        'update_post_meta_cache' => true,
        'ignore_sticky_posts'    => true,
        'fields'                 => 'ids',
        // 'fields'              => 'all',
        'ep_integrate'           => true,
    );

    $query = new WP_Query( $args );
    
    $url_list  = array();

    if ( $post_type === 'post' && $page_num === 1 ) :

        // add the homepage on the first page
        $sitemap_entry = array(
            'loc' => home_url( '/' ),
        );

        $url_list[]    = $sitemap_entry;

    endif;

    if ($query->posts) :

        // display if elasticpress query was working since debug bar doesn't load on a sitemap
        // can be removed
        $sitemap_success = array(
            'loc' => $query->elasticsearch_success,
        );
        $url_list[] = $sitemap_success;

        foreach ( $query->posts as $post ) {

            $sitemap_entry = array(
                'loc' => get_permalink($post),
                'lastmod' => get_post_field('post_modified_gmt', $post),
                // 'loc' => $post->permalink,
                // 'lastmod' => $post->post_modified_gmt,
            );

            $url_list[]    = $sitemap_entry;
        }

    endif;

    return $url_list;
}
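
Two small pieces of the callback above can be pulled out into pure helpers, sketched here as an illustration only. The function names are hypothetical. The first restates the offset arithmetic. The second assumes post_modified_gmt arrives as a "Y-m-d H:i:s" string in UTC and converts it to the W3C Datetime form that the sitemap protocol asks for in lastmod.

```php
<?php
// Hypothetical helpers; names are illustrative and not part of WordPress or ElasticPress.

/** Zero-based query offset for a 1-based sitemap page number. */
function os_sitemap_offset( int $page_num, int $per_page ): int {
    // Same result as ( $per_page * $page_num ) - $per_page, clamped at 0.
    return max( 0, ( $page_num - 1 ) * $per_page );
}

/**
 * Converts a "Y-m-d H:i:s" UTC timestamp to W3C Datetime.
 * Returns an empty string when the value cannot be parsed.
 */
function os_sitemap_lastmod( string $modified_gmt ): string {
    $date = DateTimeImmutable::createFromFormat(
        'Y-m-d H:i:s',
        $modified_gmt,
        new DateTimeZone( 'UTC' )
    );
    return $date ? $date->format( DATE_W3C ) : '';
}
```

Inside the loop, the entry could then use `'lastmod' => os_sitemap_lastmod( get_post_field( 'post_modified_gmt', $post ) )`, and the query could use `'offset' => os_sitemap_offset( $page_num, $per_page )`.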

stevesearer, Feb 03 '23 01:02