
SOLR-17419: Introduce ParallelHttpShardHandler

Open gerlowskija opened this issue 6 months ago • 0 comments

https://issues.apache.org/jira/browse/SOLR-17419

Description

The default ShardHandler implementation, HttpShardHandler, sends all shard-requests serially, parallelizing only the waiting for and parsing of responses. This works well for collections with few shards, but the overhead of serialized sending grows with the number of shards. It is especially stark when auth is enabled, since PKI header-generation happens at request-sending time.

Solution

This commit addresses this by introducing an alternate ShardHandler implementation, ParallelHttpShardHandler, geared towards collections with many shards. This ShardHandler uses an executor to parallelize both request sending and response waiting/parsing. This consumes more CPU, but greatly reduces the latency/QTime observed by users querying many-shard collections.
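To make the difference concrete, here is a minimal, self-contained sketch of the two fan-out styles. It is not the code in this PR and does not use any Solr classes: `ShardFanOutSketch`, `serialSend`, `parallelSend`, `prepare`, and `exchange` are all hypothetical names, with `prepare` standing in for per-request work such as header generation and `exchange` standing in for the network round trip.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Illustrative only: hypothetical names, no Solr APIs involved.
public class ShardFanOutSketch {

  // Serial sending: per-request preparation runs on the calling thread,
  // one shard at a time; only the exchange/wait step is handed to the pool.
  static List<String> serialSend(List<String> shards,
                                 Function<String, String> prepare,
                                 Function<String, String> exchange,
                                 ExecutorService pool) {
    List<CompletableFuture<String>> pending = new ArrayList<>();
    for (String shard : shards) {
      String request = prepare.apply(shard); // caller thread pays this cost per shard
      pending.add(CompletableFuture.supplyAsync(() -> exchange.apply(request), pool));
    }
    return collect(pending);
  }

  // Parallel sending: preparation and exchange both run on pool threads,
  // so the calling thread only pays the cost of submitting tasks.
  static List<String> parallelSend(List<String> shards,
                                   Function<String, String> prepare,
                                   Function<String, String> exchange,
                                   ExecutorService pool) {
    List<CompletableFuture<String>> pending = new ArrayList<>();
    for (String shard : shards) {
      pending.add(CompletableFuture.supplyAsync(
          () -> exchange.apply(prepare.apply(shard)), pool));
    }
    return collect(pending);
  }

  // Wait for every response, preserving shard order.
  private static List<String> collect(List<CompletableFuture<String>> pending) {
    List<String> responses = new ArrayList<>();
    for (CompletableFuture<String> future : pending) {
      responses.add(future.join());
    }
    return responses;
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<String> shards = List.of("shard1", "shard2", "shard3");
      Function<String, String> prepare = shard -> "req(" + shard + ")";
      Function<String, String> exchange = request -> "resp:" + request;
      System.out.println(serialSend(shards, prepare, exchange, pool));
      System.out.println(parallelSend(shards, prepare, exchange, pool));
    } finally {
      pool.shutdown();
    }
  }
}
```

Both methods return the same responses; the difference is where the `prepare` cost lands. In the serial style it accumulates on the calling thread in proportion to the shard count, while in the parallel style it is spread across the executor's threads, trading extra CPU for lower latency.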

(I have some really promising perf test results I'll share soon - see SOLR-17149 for more discussion on that front.)

Remaining TODOs:

  • tests for ParallelHttpShardHandler
  • precommit/check
  • ref-guide docs for shard handler abstraction
  • test randomization for http vs parallel SH

Tests

Still TBD

Checklist

Please review the following and check all that apply:

  • [ ] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • [ ] I have created a Jira issue and added the issue ID to my pull request title.
  • [ ] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • [ ] I have developed this patch against the main branch.
  • [ ] I have run ./gradlew check.
  • [ ] I have added tests for my changes.
  • [ ] I have added documentation for the Reference Guide.

gerlowskija, Aug 29 '24 19:08