Use ElasticPress on WP's core sitemap creation
Is your enhancement related to a problem? Please describe.
On large sites, sitemap generation can be resource-intensive. Running those queries through ElasticPress could speed things up.
Describe the solution you'd like
Initial discovery suggests that the code for posts could be as simple as:
add_filter(
    'wp_sitemaps_posts_query_args',
    function( $args ) {
        $args['ep_integrate'] = true;
        return $args;
    }
);
Similar solutions could be achieved for users and terms if those features are active.
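As a rough sketch of what that might look like, the same callback could be attached to the core `wp_sitemaps_users_query_args` and `wp_sitemaps_taxonomies_query_args` filters. The function name `my_sitemap_ep_integrate` is hypothetical, and this assumes the ElasticPress Users and Terms features are active and honor `ep_integrate` on their queries. Untested against a real index.

```php
<?php
/**
 * Hypothetical helper: flag a set of query args for ElasticPress.
 * Assumes the matching ElasticPress feature (Users or Terms) is active.
 */
function my_sitemap_ep_integrate( $args ) {
    $args['ep_integrate'] = true;
    return $args;
}

// Guarded so the file can also be loaded outside WordPress.
if ( function_exists( 'add_filter' ) ) {
    // Core sitemap filters for the users and taxonomies providers.
    add_filter( 'wp_sitemaps_users_query_args', 'my_sitemap_ep_integrate' );
    add_filter( 'wp_sitemaps_taxonomies_query_args', 'my_sitemap_ep_integrate' );
}
```

Sharing one named callback keeps the three providers consistent and makes it easy to remove the integration with `remove_filter()` if a query misbehaves.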
Props @tott
@felipeelia
Very interesting, thanks for posting about this!
Using the code you provided, we received Query Response Code: HTTP 0 in the ElasticPress Query Log. The initial problem was the number of posts requested: once we dropped to about 250 per page, the query worked.
Filtering wp_sitemaps_posts_pre_url_list allowed us to make more modifications. Switching fields from all to ids let us get a successful response with a much larger posts_per_page.
However, this means we need to call get_permalink() and get_post_field() to build the sitemap data, and I'm not certain of the implications of calling both 1,000 times per sitemap page. That said, the official WordPress code also uses get_permalink(), so maybe it isn't a big deal.
add_filter( 'wp_sitemaps_posts_pre_url_list', 'os_sitemap_callback', 10, 3 );

// Build the posts sitemap URL list from an ElasticPress-backed query.
// Returning an array here short-circuits core's own URL list generation.
function os_sitemap_callback( $pre_url_list, $post_type, $page_num ) {
    $per_page = 1000;
    $offset   = ( $per_page * $page_num ) - $per_page;

    $args = array(
        'orderby'                => 'modified',
        'order'                  => 'ASC',
        'post_type'              => $post_type,
        'posts_per_page'         => $per_page,
        'offset'                 => $offset,
        'post_status'            => array( 'publish' ),
        'no_found_rows'          => true,
        'update_post_term_cache' => true,
        'update_post_meta_cache' => true,
        'ignore_sticky_posts'    => true,
        // Requesting only IDs keeps the Elasticsearch response small.
        'fields'                 => 'ids',
        // 'fields' => 'all',
        'ep_integrate'           => true,
    );

    $query    = new WP_Query( $args );
    $url_list = array();

    if ( 'post' === $post_type && 1 === $page_num ) {
        // Add the homepage on the first page.
        $url_list[] = array(
            'loc' => home_url( '/' ),
        );
    }

    if ( $query->posts ) {
        // Debug only: expose whether the ElasticPress query succeeded, since
        // the debug bar doesn't load on a sitemap. Can be removed.
        $url_list[] = array(
            'loc' => $query->elasticsearch_success,
        );

        // With 'fields' => 'ids', each $post here is a post ID.
        foreach ( $query->posts as $post ) {
            $url_list[] = array(
                'loc'     => get_permalink( $post ),
                'lastmod' => get_post_field( 'post_modified_gmt', $post ),
                // 'loc' => $post->permalink,
                // 'lastmod' => $post->post_modified_gmt,
            );
        }
    }

    return $url_list;
}
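On the concern about calling get_permalink() and get_post_field() once per ID: without a warm object cache, each call may load its post from the database individually. One possible mitigation, sketched below, is to prime the post cache for the whole batch first with core's `_prime_post_caches()`. The helper name `os_sitemap_entries_from_ids` is hypothetical, and whether this actually helps depends on the site's caching setup, so treat it as an idea to measure rather than a confirmed fix.

```php
<?php
/**
 * Hypothetical helper: build sitemap entries from a list of post IDs,
 * loading the posts into the object cache in one batch beforehand.
 */
function os_sitemap_entries_from_ids( array $post_ids ) {
    if ( empty( $post_ids ) ) {
        return array();
    }

    // Load all uncached posts in a single query instead of one per ID.
    // Term and meta caches are skipped here; a permalink structure that
    // uses %category% may want the term cache primed as well.
    if ( function_exists( '_prime_post_caches' ) ) {
        _prime_post_caches( $post_ids, false, false );
    }

    $entries = array();
    foreach ( $post_ids as $post_id ) {
        $entries[] = array(
            'loc'     => get_permalink( $post_id ),
            'lastmod' => get_post_field( 'post_modified_gmt', $post_id ),
        );
    }

    return $entries;
}
```

Inside the callback above, the foreach loop could then be replaced with something like `$url_list = array_merge( $url_list, os_sitemap_entries_from_ids( $query->posts ) );`.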