Refine indexing pressure accounting in semantic bulk inference filter
In #125517, we estimated that inference results would double the document _source size, since they are pooled by the bulk action. This PR reduces the memory needed to perform the update by reusing the original source array when possible. That way, we only need to account for the extra inference fields, which reduces the overall indexing pressure.
Additionally, this PR introduces a new counter in InferenceStats to track the number of rejections caused by indexing pressure from inference results.
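To illustrate the accounting change described above, here is a minimal, self-contained sketch. It is not the actual Elasticsearch code: the class `InferencePressureSketch` and all of its methods are hypothetical names invented for this example, and the byte budget is a simplified stand-in for the real indexing pressure tracking. It only shows why reserving the size of the extra inference fields, rather than assuming the source doubles, lets more requests fit under the same limit, and how a rejection counter could be bumped when a reservation fails.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a simplified byte budget, NOT the Elasticsearch IndexingPressure API.
final class InferencePressureSketch {
    private final long limitBytes;
    private final AtomicLong usedBytes = new AtomicLong();
    // Counts reservations rejected because the budget would be exceeded.
    private final AtomicLong rejections = new AtomicLong();

    InferencePressureSketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Earlier estimate: assume inference results double the source,
    // so the extra reservation equals the full source size.
    static long conservativeExtraBytes(byte[] source) {
        return source.length;
    }

    // Refined estimate: the original source array is reused,
    // so only the added inference fields need to be accounted for.
    static long refinedExtraBytes(byte[] inferenceFields) {
        return inferenceFields.length;
    }

    // Reserves bytes if they fit under the limit; otherwise records a rejection.
    boolean tryReserve(long bytes) {
        while (true) {
            long current = usedBytes.get();
            if (current + bytes > limitBytes) {
                rejections.incrementAndGet();
                return false;
            }
            if (usedBytes.compareAndSet(current, current + bytes)) {
                return true;
            }
        }
    }

    // Returns previously reserved bytes to the budget.
    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long usedBytes() {
        return usedBytes.get();
    }

    long rejections() {
        return rejections.get();
    }
}
```

With a 100-byte budget and an 80-byte source already reserved, the conservative estimate asks for another 80 bytes and is rejected, while the refined estimate for 15 bytes of inference fields fits.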
Pinging @elastic/search-eng (Team:SearchOrg)
Pinging @elastic/search-relevance (Team:Search - Relevance)
Hi @jimczi, I've created a changelog YAML for you.