[BUG] Significant performance degradation with Scroll API in OpenSearch versions >= 2.6
Describe the bug
We noticed a significant performance hit while using the Scroll API to scroll a large volume of data on version 2.11.0. A process that previously took a couple of hours on version 2.3.0 now takes several hours longer after upgrading to 2.11.0. Our tests indicate this regression was introduced in 2.6.0. It is also easily reproducible across environments, including a cluster with multiple master and data nodes as well as a local single-node instance. As the attached graphs show, prior to 2.6.0 the Scroll API has a relatively flat response time, while starting with 2.6.0 the response time increases (roughly linearly?) up to some level and stays there thereafter. Overall, response times on 2.6+ are worse than on prior versions.
Related component
Search:Performance
To Reproduce
Test Machine Details
- MacBook Pro M3 with 36GB RAM and 500GB of storage running MacOS 14.6.1
- JDK-17.0.12
Setup
- Install GNU Coreutils, required for the `gdate` and `ghead` commands (needed only on macOS): `$ brew install coreutils`
- Clone the OpenSearch repository and check out the tag for the version to test: `$ git checkout -b 2.5.0 tags/2.5.0`
- Build the distribution: `$ ./gradlew assemble -x test`
- Extract the relevant distribution for your platform. In my case, I used the no-jdk-darwin-arm64-tar bundle with the following node configuration (`opensearch.yml`):

```yaml
cluster.name: local-cluster
node.name: node1
path.data: /tmp/data
path.logs: /tmp/logs
path.repo: /tmp/backup
network.host: 0.0.0.0
http.port: 10200
transport.port: 10300
cluster.initial_cluster_manager_nodes: node1
bootstrap.system_call_filter: false
```
Test
- Index about 40 million records into OpenSearch (using opensearch_data_load.py, requirements.txt):

```sh
$ python3 -m venv .testenv
$ source ./.testenv/bin/activate
$ pip3 install -r requirements.txt
$ python opensearch_data_load.py
```

- Run the test script (perftest.sh) that scrolls through the dataset: `$ ./perftest.sh 10000 perf_test > testrun.log`. We used a batch size of 10,000 to scroll, which was most optimal for our use case. If running on Linux, replace instances of `gdate` with `date` in the script. (A rough sketch of such a scroll-timing loop is shown after this list.)
- Extract the response times from the log file and plot them in a charting tool such as MS Excel to see the trend: `$ tail -n +2 testrun1.log | ghead -n -3 | awk '{print $(NF-1)}'`. Use the `head` command instead of `ghead` if running on Linux.
- Repeat the above Setup and Test steps for 2.6.0 and the other OpenSearch versions.
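For reference, here is a minimal sketch of what such a scroll-timing loop could look like using the opensearch-py client. The actual test we ran is the attached perftest.sh; the host, port, index name, and batch size below are simply assumed from the setup described above, not taken from the script itself.

```python
# Rough sketch of the scroll-timing loop (illustrative only; the real test is perftest.sh).
# Assumes opensearch-py and a node listening on localhost:10200 as configured above.
import time
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 10200}], use_ssl=False)

batch_size = 10000
index_name = "perf_test"

# Initial search request that opens the scroll context.
start = time.monotonic()
resp = client.search(
    index=index_name,
    scroll="2m",
    body={"size": batch_size, "_source": False, "query": {"match_all": {}}},
)
print(f"initial search took {time.monotonic() - start:.3f}s")

scroll_id = resp["_scroll_id"]
while resp["hits"]["hits"]:
    # Time each scroll page; this is the response time that regresses after 2.6.0.
    start = time.monotonic()
    resp = client.scroll(scroll_id=scroll_id, scroll="2m")
    print(f"scroll page took {time.monotonic() - start:.3f}s")
    scroll_id = resp["_scroll_id"]

# Release the scroll context when done.
client.clear_scroll(scroll_id=scroll_id)
```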
Expected behavior
Our expectation is that Scroll API performance remains consistent across versions and, in particular, does not get worse.
Additional Details
Plugins: No plugins installed
Screenshots
Host/Environment (please complete the following information):
- MacBook Pro M3 with 36GB RAM and 500GB of storage running MacOS 14.6.1
- JDK-17.0.12
Additional context: The test scripts used to load the data and run the test are attached (scripts.zip).
Thanks for reporting this. Since you have a clear repro, could you try bisecting this down to a specific change, running the tests against a local environment launched via `./gradlew run`?
@dblock - The change that introduced the performance regression is commit 4aeae87c25b in 2.6.0.
See the attached charts below for 4aeae87c25b and the previous commit cb42bb2a1f7.
Looking at the change in 4aeae87c25b, though, I'm not so sure it's a bug; it may be more of a documentation issue. We were able to work around it by adding a sort criterion on `_doc` in ascending order to the initial search request created before we start scrolling. I didn't find any reference in the OpenSearch docs on using `_doc` to scroll (apologies if it is documented and I missed it). I did, however, find a note in the Elasticsearch docs:
> _doc has no real use-case besides being the most efficient sort order. So if you don’t care about the order in which documents are returned, then you should sort by _doc. This especially helps when scrolling.
Modifying my initial scroll query as follows gave improved response times. See the additional graphs, taken on 2.11.0, for scrolling with and without the sort criterion.
```sh
'{
  "size": '"$batch_size"',
  "_source": false,
  "version": true,
  "explain": false,
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_doc": {
        "order": "asc"
      }
    }
  ]
}'
```
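For anyone applying the same workaround from client code, here is a rough sketch of the initial search request with the `_doc` sort added, again using the opensearch-py client. The connection details and index name are assumptions carried over from the repro above; only the initial request changes, and the subsequent scroll calls stay the same.

```python
# Illustrative only: the same workaround expressed via opensearch-py.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 10200}], use_ssl=False)

resp = client.search(
    index="perf_test",
    scroll="2m",
    body={
        "size": 10000,
        "_source": False,
        "version": True,
        "explain": False,
        "query": {"match_all": {}},
        # The workaround: sort by _doc ascending so the scroll walks documents
        # in index order, the cheapest possible sort.
        "sort": [{"_doc": {"order": "asc"}}],
    },
)
scroll_id = resp["_scroll_id"]
```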
[Search Triage] @gashutos Can you please verify if this is related to the optimization you did?
@rishabh6788 Do we have any operation in our nightly benchmarks that can help verify this going forward? cc: @rishabhmaurya
We do have a scroll operation running for the big5 workload in our nightlies, but the oldest version we have data for is OS 2.7; before that we only have data for the 1.x line.
Graph Link: https://s12d.com/nWqMHevE This is on a single-node cluster with a 1-shard / 0-replica configuration. The corpus size is 100G, and the index size is ~24G (116M records).
> We were able to work around this issue by adding a sort criteria to sort by _doc in Ascending order to the initial Search request created before we start Scrolling.
Maybe we should just do that by default. If someone does a scroll without specifying their own sort criteria, we can/should sort by _doc ascending.
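To make the suggestion concrete, here is a small sketch of the proposed behavior expressed as a client-side helper rather than the actual server-side change (the helper name and shape are purely illustrative): if the request that opens the scroll has no explicit sort, default it to `_doc` ascending.

```python
# Sketch of the proposed default, as a client-side helper (not the server change):
# if the caller opens a scroll without a sort, fall back to the cheap _doc
# ascending sort instead of the default relevance sort.
def open_scroll(client, index, body, scroll="2m"):
    if "sort" not in body:
        body = {**body, "sort": [{"_doc": {"order": "asc"}}]}
    return client.search(index=index, scroll=scroll, body=body)
```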