Misleading helpers.scan preserve_order documentation.

Open kramarz opened this issue 7 years ago • 1 comments

I need to 'scan' with 'preserve_order' and it works well. However, the documentation warns against setting it to true:

don’t set the search_type to scan - this will cause the scroll to paginate with preserving the order. Note that this can be an extremely expensive operation and can easily lead to unpredictable results, use with caution.

https://elasticsearch-py.readthedocs.io/en/master/helpers.html#elasticsearch.helpers.scan

That really confused me for a moment.

search_type=scan no longer exists in Elasticsearch versions after 2, and there is no trace of pagination in the code. In addition, the most recent ES documentation makes no mention of "unpredictable results"; it just says that ordering by anything other than _doc makes the scroll slower. https://www.elastic.co/guide/en/elasticsearch/reference/6.4/search-request-scroll.html

Suggested documentation: "Don't set sort to _doc. Note that this will make the operation slower."

kramarz • Aug 29 '18 12:08

The documentation should note that providing a query with a sort specified and preserve_order=False will clobber the existing sort.

It might be a good idea to check whether query includes a sort when preserve_order is False, and issue a warning if an existing sort is being clobbered. It took a while to realize why my sort specification wasn't being honored.
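
The check suggested above could look roughly like the sketch below. It is not the library's code: `prepare_scan_query` is a hypothetical standalone function that only models the behaviour described in this thread (forcing `sort` to `_doc` when `preserve_order` is False), and the warning text is made up for illustration.

```python
import warnings
from typing import Any, Dict, Optional


def prepare_scan_query(
    query: Optional[Dict[str, Any]], preserve_order: bool = False
) -> Dict[str, Any]:
    """Hypothetical helper: build the search body a scan would send.

    Assumption: with preserve_order=False the sort is forced to "_doc",
    as described in this issue. The caller's dict is never mutated.
    """
    body = dict(query) if query else {}
    if preserve_order:
        # Leave any user-supplied sort untouched.
        return body
    if "sort" in body and body["sort"] not in ("_doc", ["_doc"]):
        # Tell the user their sort is about to be replaced.
        warnings.warn(
            "preserve_order=False replaces the query's sort with '_doc'; "
            "pass preserve_order=True to keep it",
            UserWarning,
            stacklevel=2,
        )
    body["sort"] = "_doc"
    return body


if __name__ == "__main__":
    q = {"query": {"match_all": {}}, "sort": [{"timestamp": "desc"}]}
    print(prepare_scan_query(q))                       # warns, sort becomes "_doc"
    print(prepare_scan_query(q, preserve_order=True))  # sort kept as given
```

The shallow copy means the caller's query dict still holds its original sort afterwards, so the same dict can be reused with preserve_order=True.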

freddrake • Nov 22 '19 20:11

Closing this old issue. Please reopen if still relevant with recent client software.

technige • Dec 06 '22 09:12