SOLR-17419: Introduce ParallelHttpShardHandler
https://issues.apache.org/jira/browse/SOLR-17419
Description
The default ShardHandler implementation, HttpShardHandler, sends all shard-requests serially, only parallelizing the waiting and parsing of responses. This works well for collections with few shards, but as the number of shards increases, the serialized sending of shard-requests adds more and more overhead. This is especially stark when auth is enabled and PKI header-generation happens at request-sending time.
Solution
This commit fixes this by introducing an alternate ShardHandler implementation, geared towards collections with many shards. This ShardHandler uses an executor to parallelize both request sending and response waiting/parsing. This consumes more CPU, but greatly reduces the latency/QTime observed by users querying many-shard collections.
(I have some really promising perf test results I'll share soon - see SOLR-17149 for more discussion on that front.)
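To illustrate the general idea of moving request sending onto an executor, here is a minimal sketch using only `java.util.concurrent`. It is not the code in this PR: the class `ParallelFanOutSketch`, the method `fanOut`, and the `sendAndParse` callback are hypothetical names chosen for illustration, and the callback stands in for whatever per-shard work (building the request, generating headers, sending, parsing) the real handler performs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Illustrative only: hypothetical names, not the ParallelHttpShardHandler code in this PR.
public class ParallelFanOutSketch {

  // Submits one task per shard so that the per-request work (for example, header
  // generation) runs on executor threads rather than serially on the caller's thread.
  static <R> List<R> fanOut(
      List<String> shards, Function<String, R> sendAndParse, ExecutorService executor) {
    List<CompletableFuture<R>> pending = new ArrayList<>();
    for (String shard : shards) {
      pending.add(CompletableFuture.supplyAsync(() -> sendAndParse.apply(shard), executor));
    }
    // Collect responses in shard order; join() blocks until each task completes.
    List<R> responses = new ArrayList<>();
    for (CompletableFuture<R> future : pending) {
      responses.add(future.join());
    }
    return responses;
  }

  public static void main(String[] args) {
    List<String> shards = List.of("shard1", "shard2", "shard3", "shard4");
    ExecutorService executor = Executors.newFixedThreadPool(shards.size());
    try {
      // Stand-in for building, sending, and parsing a real shard request.
      List<String> responses = fanOut(shards, shard -> "response from " + shard, executor);
      responses.forEach(System.out::println);
    } finally {
      executor.shutdown();
    }
  }
}
```

In this sketch the calling thread only enqueues tasks, so any expensive per-request step is spread across the pool's threads. That is where the extra CPU use mentioned above would come from, in exchange for lower end-to-end latency when there are many shards.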
Remaining TODOs:
- tests for ParallelHttpShardHandler
- precommit/check
- ref-guide docs for shard handler abstraction
- test randomization between HttpShardHandler and ParallelHttpShardHandler
Tests
Still TBD
Checklist
Please review the following and check all that apply:
- [ ] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
- [ ] I have created a Jira issue and added the issue ID to my pull request title.
- [ ] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
- [ ] I have developed this patch against the `main` branch.
- [ ] I have run `./gradlew check`.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the Reference Guide