Extend size_limit setting in query engine to support unlimited index query.
Problem statements
Currently, the query.size_limit setting configure the maximum amount of documents to be pull from OpenSearch. The default value is: 200. for example, Let's say size_limit = 200, and index has 10K docs.
- expected, return 200 rows.
source=index - expected return 1 docs
source=index | head 1 - return 200 rows, it is not user expected.
source=index | head 11000
Proposal
The query.size_limit configure the maximum amount of rows returned by query. The default value is: 200. size_limit must larger than 0. If the query has head(PPL) or limit(SQL). it will override the query.size_limit setting.
Expectation of search query .
- should return 200 rows.
source=index - should return 1 rows.
source=index | head 1 - should return 11000 rows.
source=index | head 11000
Expectation of aggregation query.
- should return 200 rows.
source=index | stats request, count(*) by request - should return 11000 rows.
source=index | stats request, count(*) by request | head 11000
Probably another way of thinking about this: the size_limit setting is just for default behavior. If users specify a larger number by head or LIMIT, that means they're aware of what they're doing and just want to override the default limit value. This may be safer than setting size limit to -1 and user run a query without head command later?
Probably another way of thinking about this: the
size_limitsetting is just for default behavior. If users specify a larger number byheadorLIMIT, that means they're aware of what they're doing and just want to override the default limit value. This may be safer than setting size limit to -1 and user run a query without head command later?
Update the proposal as discussed.
Design
OpenSearch Request
Request Operators
Non Aggregation Query
Interface to the OpenSearch engine used by the OpenSearchIndexScan physical plan
OpenSearchQueryRequestThe default request operator.OpenSearchScrollRequestThis is used if query size exceeds theindex.max_result_windowsetting. It invokes scroll requests to OpenSearch and fetches results in batches.
Aggregation Query
There's no scroll request for aggregation queries in OpenSearch.
For a composite (group by) aggregation query, the response contains a keyAfter field, which can be used in the next request to fetch the next buckets.
Request Builder
OpenSearchRequestBuilder builds OpenSearchQueryRequest or OpenSearchScrollRequest, depending on whether scrolling is needed.
Physical Plan Implementation
- Get
index.max_result_windowfor indices. - Initializes
OpenSearchIndexScan, which containsOpenSearchRequestBuilder - Visit logical plan with index scan as context, so logical operators visited will accumulate (push down) OpenSearch query and aggregation DSL on index scan. The operations are pushed down to the request builder.
Index Scan Execution
- Build the request upon
plan.open() - Fetch the results in batches of size
maxResultWindow - When
plan.close(), clean up the cursor and context in OpenSearch engine if request type isOpenSearchScrollRequest
Remaining issues:
- Extend query size limit for aggregation query requests
- Described as follows
Here we assume
query.size_limit = 200
index.max_result_window = 10000
These work as expected:
source=indexreturns 200 rowssource=index | head 1returns 1 rowsource=index | head 300returns 300 rowssource=index | head 11000returns 11000 rows using scrollsource=index | fields a,breturns 200 rowssource=index | fields a,b | head 1returns 1 row
But these don't:
source=index | fields a,b | head 300returns 200 rowssource=index | fields a,b | head 11000returns 200 rows
The reason being that limit is only pushed down to index scan if they're optimized and merged into a single node. In these two cases the index scan has query size 200 (query.size_limit).
Solution
Option 1
Better logical plan optimization so that the Project logical plan node doesn't block optimization for other plan nodes. Project isn't merged with Relation / Index Scan, and thus stops Limit from merging with Relation / Index Scan
One note on the performance. With this feature, there's no limitation on the size of the query result anymore, so it's possible that a single request-response cycle take too long and timeout.