Post type sitemap might be empty due to "wpseo_exclude_from_sitemap_by_post_ids" filter
- [x] I've read and understood the contribution guidelines.
- [x] I've searched for any related issues and avoided creating a duplicate issue.
Please give us a description of what happened.
The methods get_index_links and get_posts in inc/sitemaps/class-post-type-sitemap-provider.php ignore the post IDs excluded through the wpseo_exclude_from_sitemap_by_post_ids filter when they calculate how many posts belong in each sitemap. The filter itself works fine and the posts are excluded from the sitemaps, BUT the sitemaps can then contain fewer links than the limit defined in the $max_entries variable (1000 in my case). Some sitemaps have fewer links than expected, and I even have one empty sitemap (all posts that fall into it are excluded), which isn't good.
For example, /post-sitemap1.xml is empty.
But /post-sitemap2.xml contains 223 URLs.
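To make the effect concrete, here is a minimal sketch in plain PHP. It is a simplified model of the behavior described above, not the plugin's actual code: the function name, the total of 1223 posts, and the assumption that exactly the first 1000 posts are excluded are all hypothetical, chosen only to reproduce the numbers in this report.

```php
<?php
// Simplified model, not Yoast's implementation: pages are cut from the
// full post list first, and exclusions are applied to each page afterwards.
function sitemap_pages_exclude_after( array $post_ids, array $excluded_ids, int $max_entries ): array {
    $pages = [];
    foreach ( array_chunk( $post_ids, $max_entries ) as $chunk ) {
        // array_diff() preserves keys, so re-index to get a clean list.
        $pages[] = array_values( array_diff( $chunk, $excluded_ids ) );
    }
    return $pages;
}

$post_ids     = range( 1, 1223 ); // assumed total number of posts
$excluded_ids = range( 1, 1000 ); // assumed: every post on page 1 is excluded
$pages        = sitemap_pages_exclude_after( $post_ids, $excluded_ids, 1000 );

foreach ( $pages as $i => $urls ) {
    printf( "/post-sitemap%d.xml: %d URLs\n", $i + 1, count( $urls ) );
}
```

Under these assumptions the first page ends up with 0 URLs and the second with 223, because the page boundaries were fixed before the exclusions were taken into account.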
Please describe what you expected to happen and why.
The get_posts method might be fixed by using the wpseo_typecount_where and wpseo_posts_where filters, because the code already supports them. Example:
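// Adds an exclusion clause to both the post count query and the post query.
function get_excluded_wpseo_where( $sql, $post_type ) {
    if ( $post_type === 'post' ) {
        $args = [
            /* some query arguments */
            'fields' => 'ids',
        ];
        $result = new WP_Query( $args );
        // Cast to integers, and skip the clause when nothing matched to avoid an invalid "NOT IN ()".
        $ids = array_map( 'intval', $result->posts );
        if ( ! empty( $ids ) ) {
            // Append instead of overwrite so clauses from other callbacks are kept.
            $sql .= ' AND ID NOT IN (' . implode( ',', $ids ) . ')';
        }
    }
    return $sql;
}
add_filter( 'wpseo_typecount_where', 'get_excluded_wpseo_where', 10, 2 );
add_filter( 'wpseo_posts_where', 'get_excluded_wpseo_where', 10, 2 );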
It would be nice to have the same functionality for the following piece of code in the get_index_links method, for example in the same way as in the get_posts method:
$where_filter = apply_filters( 'wpseo_index_links_where', false, $post_type );
...
$sql = "
    SELECT post_modified_gmt
    FROM ( SELECT @rownum:=0 ) init
    JOIN {$wpdb->posts} USE INDEX( type_status_date )
    WHERE post_status IN ('" . implode( "','", $post_statuses ) . "')
        AND post_type = %s
        AND ( @rownum:=@rownum+1 ) %% %d = 0
        {$where_filter}
    ORDER BY post_modified_gmt ASC
";
But adding that filter would only allow extending the SQL query; it doesn't solve the problem described above. Still, I would be happy even if only this part is implemented.
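For comparison, the behavior this issue is ultimately asking for can be sketched in plain PHP as "exclude first, then paginate". This is only an illustration under assumed numbers (1223 posts, the first 1000 excluded, 1000 entries per sitemap); the helper name is hypothetical and it is not a patch for the plugin.

```php
<?php
// Hypothetical helper: drop the excluded IDs first, then cut the
// remaining posts into pages of at most $max_entries each.
function sitemap_pages_exclude_before( array $post_ids, array $excluded_ids, int $max_entries ): array {
    // array_diff() preserves keys, so re-index before chunking.
    $remaining = array_values( array_diff( $post_ids, $excluded_ids ) );
    return array_chunk( $remaining, $max_entries );
}

// Assumed example data: 1223 posts, of which the first 1000 are excluded.
$pages = sitemap_pages_exclude_before( range( 1, 1223 ), range( 1, 1000 ), 1000 );

printf( "%d sitemap(s), the first one has %d URLs\n", count( $pages ), count( $pages[0] ) );
```

With the exclusions applied before the page boundaries are computed, the 223 remaining posts fit into a single sitemap and no empty sitemap is produced.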
Please inform the customer of conversation # 1059186 when this conversation has been closed.
Please inform the customer of conversation # 1084791 when this conversation has been closed.
Please inform the customer of conversation # 1158899 when this conversation has been closed.
Please inform the customer of conversation # 1222507 when this conversation has been closed.
As discussed via Ticket already:
Google Search Console (GSC) marks empty Sitemap files as "Error".
Based on an already closed issue ( https://github.com/Yoast/wordpress-seo/issues/16771 ), the decision to keep empty sitemaps was made back in 2021 because GSC marked deleted but previously existing sitemaps as an error, and probably did not do so for empty ones. This no longer applies. It seems that empty and deleted sitemaps are now at least equally "not good". Maybe the behavior has even flipped.
Removing empty sitemaps, along with all the code needed to maintain the relation between posts and specific sitemaps, should also speed up sitemap rendering.
Please inform the customer of conversation # 1300158 when this conversation has been closed.