[5.x] Performance Optimisation for Queries - Optimise IteratorBuilder limit queries to avoid loading all items
Description
This PR optimises the IteratorBuilder to avoid loading all items when a limit() is applied without orderBy() or inRandomOrder(). This significantly improves search query performance, particularly for sites with large datasets.
Fixes https://github.com/statamic/cms/issues/13215
Problem
Previously, IteratorBuilder::getFilteredItems() loaded ALL items before applying limits. For a ->limit(10) query on 10,000 results, it would hydrate all 10,000 items before taking the first 10.
Solution
Added a new abstract method getBaseItemsLazy(): Generator that yields items lazily, enabling early termination when the limit is reached.
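As a rough illustration of why a generator enables early termination, here is a minimal standalone sketch. It is not the code from this PR: lazyItems() is a hypothetical stand-in for getBaseItemsLazy(), takeLazy() is an invented helper, and the $hydrated counter only simulates hydration cost.

```php
<?php

/**
 * Hypothetical stand-in for getBaseItemsLazy(): yields items one at a time
 * and counts how many were actually produced ("hydrated").
 */
function lazyItems(int $total, int &$hydrated): Generator
{
    for ($i = 1; $i <= $total; $i++) {
        $hydrated++;        // simulated hydration cost
        yield ['id' => $i]; // stand-in for a hydrated item
    }
}

/**
 * Hypothetical helper: skip $offset items, collect $limit items,
 * and stop pulling from the generator as soon as the limit is reached.
 */
function takeLazy(Generator $items, int $offset, int $limit): array
{
    $results = [];
    if ($limit <= 0) {
        return $results; // nothing requested, so never start the generator
    }

    $seen = 0;
    foreach ($items as $item) {
        if ($seen++ < $offset) {
            continue; // skipped items are still produced, hence offset + limit
        }
        $results[] = $item;
        if (count($results) >= $limit) {
            break; // early termination: the generator is not advanced again
        }
    }

    return $results;
}

$hydrated = 0;
$page = takeLazy(lazyItems(10000, $hydrated), 0, 10);
```

In this sketch only the items needed to fill the page are ever produced, because breaking out of the foreach leaves the generator suspended at its last yield.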
Three optimisation paths in getFilteredItems():
- No limit, or orderBy()/inRandomOrder() is used: Falls back to loading all items (unchanged behaviour)
- Limit without wheres: Loads only offset + limit items
- Limit with wheres: Batches items and stops early when enough matches are collected
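The three paths could be sketched roughly as follows. This is an assumption-laden illustration rather than the actual getFilteredItems() implementation: choosePath(), lazySource() and filterWithLimit() are hypothetical names, and the batch size of 50 is chosen only for the example.

```php
<?php

/** Hypothetical dispatcher mirroring the three paths described above. */
function choosePath(?int $limit, bool $hasOrdering, bool $hasWheres): string
{
    if ($limit === null || $hasOrdering) {
        return 'load-all'; // sorting or shuffling needs every item
    }

    return $hasWheres ? 'batched-filter' : 'lazy-slice';
}

/** Hypothetical lazy source that counts how many items it produces. */
function lazySource(int $total, int &$hydrated): Generator
{
    for ($i = 1; $i <= $total; $i++) {
        $hydrated++;
        yield $i;
    }
}

/**
 * Sketch of the "limit with wheres" path: collect items into batches,
 * filter each full batch, and stop once offset + limit matches exist.
 */
function filterWithLimit(
    Generator $source,
    callable $where,
    int $offset,
    int $limit,
    int $batchSize = 50 // assumed value, for illustration only
): array {
    $needed = $offset + $limit;
    $matches = [];
    $batch = [];

    foreach ($source as $item) {
        $batch[] = $item;
        if (count($batch) < $batchSize) {
            continue; // keep filling the current batch
        }

        $matches = array_merge($matches, array_filter($batch, $where));
        $batch = [];

        if (count($matches) >= $needed) {
            break; // enough matches: stop before producing another item
        }
    }

    // Flush a trailing partial batch if the source ran out mid-batch.
    if ($batch !== []) {
        $matches = array_merge($matches, array_filter($batch, $where));
    }

    return array_slice($matches, $offset, $limit);
}
```

Filtering per batch rather than per item is a design choice that suits sources where hydrating many items at once is cheaper than hydrating them one by one; the trade-off is that up to one batch more than strictly necessary may be hydrated.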
Benchmark Results
I wrote a separate benchmark script to compare the current code with the new code across several scenarios (e.g. wheres, wheres + limit, limit only). orderBy is omitted from the table because its code path is unchanged; I tested it anyway and the results were the same.
| Scenario | Limit | Old Time | Old Hydrated | New Time | New Hydrated | Improvement |
|---|---|---|---|---|---|---|
| No wheres | 10 | 0.70ms | 10,000 | 0.04ms | 10 | 99.9% fewer |
| No wheres | 50 | 0.68ms | 10,000 | 0.05ms | 50 | 99.5% fewer |
| No wheres | 100 | 0.68ms | 10,000 | 0.06ms | 100 | 99% fewer |
| 50% match rate | 10 | 9.13ms | 10,000 | 0.10ms | 50 | 99.5% fewer |
| 50% match rate | 50 | 9.14ms | 10,000 | 0.15ms | 100 | 99% fewer |
| 50% match rate | 100 | 9.12ms | 10,000 | 0.26ms | 200 | 98% fewer |
| 10% match rate | 10 | 9.01ms | 10,000 | 0.18ms | 100 | 99% fewer |
| 10% match rate | 50 | 8.99ms | 10,000 | 0.57ms | 500 | 95% fewer |
| 10% match rate | 100 | 8.98ms | 10,000 | 1.08ms | 1,000 | 90% fewer |
Safety checks
- Queries with orderBy() or inRandomOrder() load all items (sorting/shuffling requires all)
- Queries without limit() load all items (no early termination possible)
- Results are identical to previous behaviour, just faster
Files Changed
- src/Query/IteratorBuilder.php - Core optimisation logic
- src/Query/ItemQueryBuilder.php - Implements getBaseItemsLazy()
- src/Search/QueryBuilder.php - Implements getBaseItemsLazy() with batch hydration
Tests
- tests/Query/IteratorBuilderTest.php - 13 tests covering optimisation paths
- tests/Search/QueryBuilderPerformanceTest.php - 14 tests for search-specific behaviour
- tests/Fakes/Query/TestIteratorBuilder.php - Test helper
- tests/Fakes/Query/HydrationTrackingQueryBuilder.php - Test helper
Note: All of these tests pass. The only failing tests are the UTF-8 ones, which Duncan is fixing in another PR.
Potential Future Optimisation
This PR optimises hydration by stopping early once the limit is reached. However, the search drivers still fetch all raw results from the index before we apply the limit.
A possible future optimisation could pass the limit down to getSearchResults($query, $limit = null) so drivers can fetch fewer results at the source:
- Algolia: Could use the hitsPerPage API parameter to request fewer hits
- Comb: Could limit raw results before mapping scores/snippets
This would be most beneficial for large indexes where the initial lookup is expensive.
In my testing with Comb, this PR alone reduces my search from 3.5s to 600ms. The future optimisation described above would bring it down further, to about 400ms.
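The limit pass-through proposed above might look roughly like this. Only the getSearchResults($query, $limit = null) shape comes from the proposal; the LimitAwareDriver interface and the ArrayDriver class are hypothetical, are not Statamic classes, and ignore relevance ranking entirely (a real driver would need the top matches by score, not just the first ones found). The sketch assumes PHP 8.0+.

```php
<?php

/** Hypothetical contract for a driver that accepts an optional limit. */
interface LimitAwareDriver
{
    public function getSearchResults(string $query, ?int $limit = null): array;
}

/** Toy in-memory driver used purely for illustration. */
final class ArrayDriver implements LimitAwareDriver
{
    /** Number of documents that were mapped to a result row. */
    public int $scored = 0;

    public function __construct(private array $documents)
    {
    }

    public function getSearchResults(string $query, ?int $limit = null): array
    {
        $results = [];

        foreach ($this->documents as $id => $text) {
            if (stripos($text, $query) === false) {
                continue; // not a match, so no scoring work is done
            }

            $this->scored++;
            $results[] = ['id' => $id, 'score' => 1.0]; // placeholder score

            if ($limit !== null && count($results) >= $limit) {
                break; // stop at the source instead of trimming later
            }
        }

        return $results;
    }
}

$driver = new ArrayDriver([
    'a' => 'red apple',
    'b' => 'green pear',
    'c' => 'apple pie',
    'd' => 'apple tart',
]);
$hits = $driver->getSearchResults('apple', 2);
```

Keeping $limit nullable means existing callers that omit it would still receive every raw result, so the change could be introduced without altering current behaviour.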